BIALLELIC MARKERS FOR USE IN CONSTRUCTING A HIGH DENSITY DISEQUILIBRIUM MAP OF THE HUMAN GENOME



Background of the Invention

Recent advances in genetic engineering and bioinformatics have enabled the manipulation and characterization
of large portions of the human genome. While efforts to obtain the full sequence of the human genome are rapidly
progressing, there are many practical uses for genetic information which can be implemented with partial knowledge of
the sequence of the human genome.

As the full sequence of the human genome is assembled, the partial sequence information available can be used
to identify genes responsible for detectable human traits, such as genes associated with human diseases, and to develop
diagnostic tests capable of identifying individuals who express a detectable trait as the result of a specific genotype or
individuals whose genotype places them at risk of developing a detectable trait at a subsequent time. Each of these
applications for partial genomic sequence information is based upon the assembly of genetic and physical maps which
order the known genomic sequences along the human chromosomes.

The present invention relates to human genomic sequences which can be used to construct a high resolution
map of the human genome, methods for constructing such a map, methods of identifying genes associated with
detectable human traits, and diagnostics for identifying individuals who carry a gene which causes them to express a
detectable trait or which places them at risk of expressing a detectable trait in the future.

Summary of the Invention

A first embodiment of the present invention is a method of obtaining a set of biallelic markers comprising the
steps of obtaining a nucleic acid library comprising a plurality of genomic DNA fragments comprising the full genome or a
portion thereof, determining the order of said plurality of genomic DNA fragments in the genome, determining the
sequence of selected regions of said plurality of genomic DNA fragments, and identifying nucleotides in said plurality of
genomic DNA fragments which vary between individuals, thereby defining a set of biallelic markers.
In one aspect of this first embodiment the identifying step comprises identifying about 20,000 biallelic
markers. In another aspect of this first embodiment the identifying step comprises identifying about 40,000 biallelic
markers. In a further aspect of this embodiment the identifying step comprises identifying about 60,000 biallelic
markers. In still another aspect of this first embodiment, the identifying step comprises identifying about 80,000
biallelic markers. In still another aspect of this first embodiment the identifying step comprises identifying about
100,000 biallelic markers. In still another aspect of this first embodiment the identifying step comprises identifying
about 120,000 biallelic markers.

WO 99/04038

PCT/IB98/01193

In still another aspect of this first embodiment, the biallelic markers are separated from one another by an
average distance of 10 kb-200 kb. In still another aspect of this first embodiment the biallelic markers are separated
from one another by an average distance of 15 kb-150 kb. In still another aspect of this first embodiment the biallelic
markers are separated from one another by an average distance of 20 kb-100 kb. In still another aspect of this first
embodiment the biallelic markers are separated from one another by an average distance of 100 kb-150 kb. In still
another aspect of this first embodiment the biallelic markers are separated from one another by an average distance of
50 kb-100 kb. In still another aspect of this first embodiment, the biallelic markers are separated from one another by an
average distance of 25 kb-50 kb.
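As an illustration of what an "average distance" between markers means in practice, the following minimal sketch computes the mean gap between consecutive markers from their genomic coordinates. The function name and the example coordinates are hypothetical and are not taken from the specification.

```python
def average_spacing_kb(positions_bp):
    """Return the mean distance, in kb, between consecutive markers.

    positions_bp: marker coordinates in base pairs along one chromosome
    (hypothetical input format chosen for this sketch).
    """
    ordered = sorted(positions_bp)
    if len(ordered) < 2:
        raise ValueError("at least two marker positions are required")
    # Gaps between each marker and the next one along the chromosome.
    gaps = [b - a for a, b in zip(ordered, ordered[1:])]
    return sum(gaps) / len(gaps) / 1000.0


# Four illustrative markers with gaps of 30 kb, 20 kb and 40 kb.
spacing = average_spacing_kb([0, 30_000, 50_000, 90_000])
```

A map would satisfy, for example, the 25 kb-50 kb aspect if the value returned for its markers falls within that range.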

In still another aspect of this first embodiment, the step of determining the sequence of selected regions of
said plurality of genomic DNA fragments comprises inserting fragments of said plurality of genomic DNA fragments into
a vector to generate a plurality of subclones and determining the sequence of a region of the inserts in said plurality of
subclones or a subset thereof. For example, in this aspect of the first embodiment, the step of determining the sequence
of a region of said inserts or a subset thereof may comprise determining the sequence of one or both end regions of said
inserts or a subset thereof. In this aspect of the first embodiment, the step of determining the sequence of one or both
end regions of said plurality of subclones comprises determining the sequence of about 500 bases at each end of said
subclones or a subset thereof.

In still another aspect of this first embodiment a set of about 10,000 to about 20,000 genomic DNA inserts
with an average size between 100 kb and 300 kb are ordered. In still another aspect of this first embodiment, a set of
about 10,000 to about 30,000 genomic DNA inserts with an average size between 100 kb and 150 kb are ordered. In
still another aspect of this first embodiment, a set of about 15,000 to about 25,000 genomic DNA inserts with an
average size between 100 kb and 200 kb are ordered.

In still another aspect of this first embodiment the identifying step comprises identifying between 1 and 6
biallelic markers per genomic DNA fragment. In still another aspect of this first embodiment, the identifying step
comprises identifying an average of 3 biallelic markers per genomic DNA insert.

In still another aspect of this first embodiment the genomic DNA fragments are in a Bacterial Artificial
Chromosome. In still another aspect of this first embodiment the genomic DNA fragments are in a Yeast Artificial
Chromosome.

In still another aspect of this first embodiment the method further comprises determining the position of said
biallelic markers along the genome or a portion thereof. In this aspect of the first embodiment, the step of determining
the position of said biallelic markers along the genome or portion thereof may comprise determining the position of said
biallelic markers along a chromosome. In this aspect of the first embodiment, the step of determining the position of
said biallelic markers along the genome or portion thereof comprises determining the position of said biallelic markers
along a subchromosomal region.

In still another aspect of this first embodiment the method further comprises identifying biallelic markers
which are in linkage disequilibrium with one another. In this aspect of the first embodiment the method may further
comprise optimizing the intermarker spacing between said biallelic markers such that each identified marker is in linkage
disequilibrium with at least one other identified marker.
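The text does not say how linkage disequilibrium between two biallelic markers is to be quantified. A minimal sketch, assuming the commonly used coefficients D, D' and r² computed from haplotype and allele frequencies, is given below; the function name and input convention are hypothetical.

```python
def linkage_disequilibrium(p_ab, p_a, p_b):
    """Return (D, D_prime, r_squared) for two biallelic markers.

    p_ab: frequency of the haplotype carrying allele A at the first marker
          and allele B at the second marker
    p_a:  frequency of allele A at the first marker
    p_b:  frequency of allele B at the second marker
    """
    # D is the deviation of the observed haplotype frequency from the
    # frequency expected if the two markers were independent.
    d = p_ab - p_a * p_b
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    d_prime = d / d_max if d_max > 0 else 0.0
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    r_squared = d * d / denom if denom > 0 else 0.0
    return d, d_prime, r_squared


# Alleles A and B always travel together: complete disequilibrium.
complete = linkage_disequilibrium(0.5, 0.5, 0.5)
# Haplotype frequency equals the product of allele frequencies: equilibrium.
independent = linkage_disequilibrium(0.25, 0.5, 0.5)
```

Under this assumption, a spacing could be called adequate when every marker has at least one neighbor whose D' or r² exceeds a chosen threshold; the threshold itself is not specified in the text.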

In still another aspect of this first embodiment, the portion of the genome comprises at least 200 kb of
contiguous genomic DNA. In still another aspect of this first embodiment the portion of the genome comprises at least
300 kb of contiguous genomic DNA. In still another aspect of this first embodiment the portion of the genome
comprises at least [...] of contiguous genomic DNA. In still another aspect of this first embodiment, the portion of the
genome comprises at least 2 Mb of contiguous genomic DNA. In still another aspect of this first embodiment, the portion
of the genome comprises at least 5 Mb of contiguous genomic DNA. In still another aspect of this first embodiment, the
portion of the genome comprises at least 10 Mb of contiguous genomic DNA. In still another aspect of this first
embodiment, the portion of the genome comprises at least 20 Mb of contiguous genomic DNA.
In still another aspect of this first embodiment, the method further comprises the step of identifying one or
more groups of biallelic markers which are in proximity to one another in the genome. In this aspect of the first
embodiment the biallelic markers in each of these groups may be located within a genomic region spanning less than
1 kb. Alternatively, in this aspect of the first embodiment, the biallelic markers in each of these groups may be located
within a genomic region spanning from 1 to 5 kb. Alternatively, in this aspect of the first embodiment the biallelic markers
in each of these groups may be located within a genomic region spanning from 5 to 10 kb. Alternatively, in this aspect of
the first embodiment the biallelic markers in each of these groups may be located within a genomic region spanning from
10 to 25 kb. Alternatively, in this aspect of the first embodiment the biallelic markers in each of these groups may be
located within a genomic region spanning from 25 to 50 kb. Alternatively, in this aspect of the first embodiment, the
biallelic markers in each of these groups may be located within a genomic region spanning from 50 to 150 kb.
Alternatively, in this aspect of the first embodiment, the biallelic markers in each of these groups may be located within a
genomic region spanning from 150 to 250 kb. Alternatively, in this aspect of the first embodiment the biallelic markers in
each of these groups may be located within a genomic region spanning from 250 to 500 kb. Alternatively, in this aspect of
the first embodiment the biallelic markers in each of these groups may be located within a genomic region spanning from
500 kb to 1 Mb. Alternatively, in this aspect of the first embodiment, the biallelic markers in each of these groups may be
located within a genomic region spanning more than 1 Mb.
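One simple way to form groups of markers lying within a region of bounded size can be sketched as follows. The greedy rule used here (start a new group whenever the next marker would extend the group beyond the allowed span) is an assumption made for illustration; the text does not prescribe a grouping procedure, and the function name and coordinates are hypothetical.

```python
def group_markers(positions_bp, max_span_bp):
    """Greedily partition marker positions into groups whose first and
    last members are no more than max_span_bp apart."""
    groups = []
    for pos in sorted(positions_bp):
        # Compare against the first (leftmost) marker of the current group
        # so that the whole group stays within the allowed span.
        if groups and pos - groups[-1][0] <= max_span_bp:
            groups[-1].append(pos)
        else:
            groups.append([pos])
    return groups


# Illustrative coordinates grouped into regions spanning at most 1 kb.
groups = group_markers([100, 900, 1500, 30_000, 30_400], 1_000)
```

Changing `max_span_bp` to 5,000, 10,000 and so on corresponds to the successive region sizes listed above.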

A second embodiment of the present invention is a method of obtaining a set of biallelic markers comprising the
steps of obtaining a nucleic acid library comprising genomic DNA fragments comprising the full genome or a portion
thereof, determining the sequence of selected regions of said genomic DNA fragments, identifying nucleotides in said
genomic DNA fragments which vary between individuals, thereby defining a set of biallelic markers, and
determining the order of said biallelic markers along the genome or portion thereof.

A third embodiment of the present invention is a set of biallelic markers obtained by the method of the first
embodiment. In one aspect of this third embodiment the markers in said set have a known genomic position. In another
aspect of this third embodiment the markers in said set have a known genomic relationship to one another.

A fourth embodiment of the present invention is a set of biallelic markers having a known relationship to one
another and a known genomic position, said set of biallelic markers being obtained by the method of the first
embodiment. In one aspect of this fourth embodiment the biallelic markers have heterozygosity rates of at least about
0.16. In another aspect of this fourth embodiment the biallelic markers have a heterozygosity rate of at least about 0.32.
In still another aspect of this fourth embodiment the biallelic markers have a heterozygosity rate of at least about 0.42.
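For a marker with only two alleles, a heterozygosity rate can be related to allele frequency. The sketch below assumes Hardy-Weinberg proportions, under which the expected heterozygosity is 2p(1 - p) for a minor allele frequency p; this assumption and the function names are illustrative and are not stated in the text.

```python
import math


def expected_heterozygosity(minor_allele_freq):
    """Expected heterozygosity 2p(1 - p) of a biallelic marker,
    assuming Hardy-Weinberg proportions."""
    p = minor_allele_freq
    return 2.0 * p * (1.0 - p)


def minor_allele_freq_for(heterozygosity):
    """Smallest allele frequency p giving the requested heterozygosity,
    obtained by solving 2p(1 - p) = H for p <= 0.5."""
    if not 0.0 <= heterozygosity <= 0.5:
        raise ValueError("a biallelic marker has heterozygosity in [0, 0.5]")
    return (1.0 - math.sqrt(1.0 - 2.0 * heterozygosity)) / 2.0


# A heterozygosity threshold of 0.32 corresponds to a minor allele
# frequency of about 0.2 under this assumption.
p_needed = minor_allele_freq_for(0.32)
```

Because 2p(1 - p) peaks at p = 0.5, no biallelic marker can exceed a heterozygosity of 0.5 under this model.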
A fifth embodiment of the present invention is a map comprising an ordered array of at least 20,000 biallelic
markers obtained by the method of the first embodiment. In one aspect of this fifth embodiment the map comprises an
ordered array of at least 60,000 biallelic markers obtained by the method of the first embodiment. In another aspect of
this fifth embodiment the map comprises an ordered array of at least 120,000 biallelic markers obtained by the method
of the first embodiment.

In another aspect of this fifth embodiment the biallelic markers are distributed at an average marker density of
one marker every 150 kb. In a further aspect of this fifth embodiment the biallelic markers are distributed at an average
marker density of one marker every 50 kb. In a further aspect of this fifth embodiment, the biallelic markers are
distributed at an average marker density of one marker every 25 kb.
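The marker counts and marker densities recited for the map are related by simple arithmetic. The sketch below assumes a haploid human genome of roughly 3 x 10^9 base pairs, a figure that is not stated in the text and is used here only for illustration.

```python
# Assumed approximate haploid human genome size in base pairs
# (an illustrative assumption, not a value taken from the text).
GENOME_SIZE_BP = 3_000_000_000


def markers_needed(spacing_kb, genome_size_bp=GENOME_SIZE_BP):
    """Number of evenly spaced markers needed to cover the genome at
    one marker every spacing_kb kilobases."""
    return genome_size_bp // (spacing_kb * 1000)


# One marker every 150 kb, 50 kb and 25 kb respectively.
counts = {spacing: markers_needed(spacing) for spacing in (150, 50, 25)}
```

Under this assumption the three densities correspond to about 20,000, 60,000 and 120,000 markers, matching the map sizes recited for the fifth embodiment.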

A sixth embodiment of the present invention is a method of identifying one or more biallelic markers associated
with a detectable trait comprising the steps of determining the frequencies of each allele of one or more biallelic
markers obtained by the method of the first embodiment in individuals who express said detectable trait and individuals
who do not express said detectable trait and identifying one or more alleles of said one or more biallelic markers which
are statistically associated with the expression of said detectable trait. In one aspect of this sixth embodiment the
detectable trait is selected from the group consisting of disease, drug response, drug efficacy, and drug toxicity. In
another aspect of this sixth embodiment the phenotype of said individuals who express said detectable trait and the
phenotype of said individuals who do not express said detectable trait are readily distinguishable from one another. In
still another aspect of this sixth embodiment the individuals who express said detectable trait and the individuals who do
not express said detectable trait are selected from a bimodal phenotype distribution. In still another aspect of this sixth
embodiment, the individuals who express said detectable trait are at one phenotypic extreme of the population and said
individuals who do not express said detectable trait are at the other phenotypic extreme of the population.
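The comparison of allele frequencies between individuals who express the trait and those who do not can be sketched with a chi-square test on a 2 x 2 table of allele counts. The choice of test is an assumption, since the text requires only a statistical association, and the function name and counts are hypothetical.

```python
import math


def allelic_association(case_a, case_b, ctrl_a, ctrl_b):
    """Pearson chi-square test (1 degree of freedom, no continuity
    correction) on allele counts for one biallelic marker.

    case_a, case_b: counts of alleles A and B in trait-positive individuals
    ctrl_a, ctrl_b: counts of alleles A and B in trait-negative individuals
    Returns (chi_square, p_value).
    """
    n = case_a + case_b + ctrl_a + ctrl_b
    row_case, row_ctrl = case_a + case_b, ctrl_a + ctrl_b
    col_a, col_b = case_a + ctrl_a, case_b + ctrl_b
    if 0 in (row_case, row_ctrl, col_a, col_b):
        return 0.0, 1.0  # degenerate table: no evidence of association
    chi2 = n * (case_a * ctrl_b - case_b * ctrl_a) ** 2 / (
        row_case * row_ctrl * col_a * col_b
    )
    # Upper tail of the chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


# Allele A is seen 60 times in 100 case chromosomes and 40 times in 100
# control chromosomes (illustrative numbers only).
chi2, p_value = allelic_association(60, 40, 40, 60)
```

When many markers are tested, the significance threshold would normally be adjusted for multiple comparisons; how that is done is outside what the text specifies.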

A seventh embodiment of the present invention is a method of identifying a haplotype associated with a trait
comprising the steps of obtaining nucleic acid samples from trait positive and trait negative individuals, determining
the frequencies of the alleles of each member of a group of biallelic markers obtained by the method of the first
embodiment which are known to be located in proximity to one another in the genome in said nucleic acid samples, and
identifying a plurality of alleles of biallelic markers having a statistically significant association with said trait. In one
aspect of this seventh embodiment the detectable trait is selected from the group consisting of disease, drug response,
drug efficacy, and drug toxicity.

In another aspect of this seventh embodiment, the biallelic markers in each of these groups are located within
a genomic region spanning less than 1 kb. In still another aspect of this seventh embodiment the biallelic markers in each
of these groups are located within a genomic region spanning from 1 to 5 kb. In still another aspect of this seventh
embodiment the biallelic markers in each of these groups are located within a genomic region spanning from 5 to 10 kb.
In still another aspect of this seventh embodiment the biallelic markers in each of these groups are located within a
genomic region spanning from 10 to 25 kb. In still another aspect of this seventh embodiment the biallelic markers in
each of these groups are located within a genomic region spanning from 25 to 50 kb. In still another aspect of this seventh
embodiment the biallelic markers in each of these groups are located within a genomic region spanning from 50 to
150 kb. In still another aspect of this seventh embodiment the biallelic markers in each of these groups are located
within a genomic region spanning from 150 to 250 kb. In still another aspect of this seventh embodiment the biallelic
markers in each of these groups are located within a genomic region spanning from 250 to 500 kb. In still another aspect
of this seventh embodiment the biallelic markers in each of these groups are located within a genomic region spanning
from 500 kb to 1 Mb. In still another aspect of this seventh embodiment, the biallelic markers in each of these groups are
located within a genomic region spanning more than 1 Mb.

An eighth embodiment of the present invention is a method of identifying one or more biallelic markers
associated with a detectable trait comprising the steps of selecting a gene in which mutations result in a detectable trait
or a gene suspected of being associated with a detectable trait and identifying one or more biallelic markers obtained by
the method of Claim 1 within the genomic region harboring said gene which are associated with said detectable trait. In
one aspect of this eighth embodiment, the detectable trait is selected from the group consisting of disease, drug
response, drug efficacy, and drug toxicity. In another aspect of this eighth embodiment, the identifying step comprises
determining the frequencies of said one or more biallelic markers in individuals who express said detectable
trait and individuals who do not express said detectable trait and identifying one or more biallelic markers which are
statistically associated with the expression of said detectable trait.

A ninth embodiment of the present invention is an array of nucleic acids fixed to a support, said nucleic acids
comprising at least 8 consecutive nucleotides, including the polymorphic nucleotide, of one or more biallelic markers
obtained by the method of the first embodiment. In one aspect of this ninth embodiment, the nucleic acids comprise at
least 15 consecutive nucleotides, including the polymorphic nucleotide, of at least five biallelic markers obtained by the
method of the first embodiment. In another aspect of this ninth embodiment, the nucleic acids comprise at least 8
consecutive nucleotides, including the polymorphic nucleotide, of at least ten biallelic markers obtained by the method
of the first embodiment.
A tenth embodiment of the present invention is an array of nucleic acids fixed to a support, said nucleic acids
comprising at least 8 consecutive nucleotides, including the polymorphic nucleotide, of one or more groups of biallelic
markers known to be located in proximity to one another in the genome.

An eleventh embodiment of the present invention is an array of nucleic acids fixed to a support, said nucleic
acids comprising amplification primers for generating an amplification product comprising at least 8 consecutive
nucleotides, including the polymorphic nucleotide, of one or more biallelic markers obtained by the method of the first
embodiment.

A twelfth embodiment of the present invention is an array of nucleic acids fixed to a support, said nucleic acids
comprising amplification primers for generating an amplification product comprising at least 15 consecutive
nucleotides, including the polymorphic nucleotide, of one or more groups of biallelic markers known to be located in
proximity to one another in the genome.

A thirteenth embodiment of the present invention is an array of nucleic acids fixed to a support, said nucleic
acids comprising one or more microsequencing primers for determining the identity of the polymorphic base of one or
more nucleic acids comprising at least 15 consecutive nucleotides, including the polymorphic nucleotide, of one or more
biallelic markers obtained by the method of the first embodiment.

A fourteenth embodiment of the present invention is an array of nucleic acids fixed to a support, said
nucleic acids comprising one or more microsequencing primers for determining the identity of the polymorphic bases of
one or more groups of biallelic markers known to be located in proximity to one another in the genome.

A fifteenth embodiment of the present invention is an array of nucleic acids fixed to a support wherein said
nucleic acids are complementary to one or more microsequencing primers for determining the identities of the
polymorphic bases of one or more biallelic markers obtained by the method of the first embodiment. In one aspect of
this fifteenth embodiment, the nucleic acids are complementary to at least five microsequencing primers for determining
the identities of the polymorphic bases of at least five biallelic markers obtained by the method of the first embodiment.
In another aspect of this fifteenth embodiment the nucleic acids are complementary to at least ten microsequencing
primers for determining the identities of the polymorphic bases of at least ten biallelic markers obtained by the method
of the first embodiment.

A sixteenth embodiment of the present invention is an array of nucleic acids fixed to a support, said nucleic
acids comprising one or more nucleic acids complementary to one or more microsequencing primers for determining the
identity of the polymorphic bases of one or more groups of biallelic markers known to be located in proximity to one
another in the genome.

Another aspect of the present invention is an array of any one of the tenth, twelfth, fourteenth or sixteenth
embodiments, wherein the members of each of said one or more groups of biallelic markers are located in physical
proximity to one another on said support.

Another aspect of the present invention is an array of any one of the tenth, twelfth, fourteenth or
sixteenth embodiments, wherein said biallelic markers in each of these groups are located within a genomic region
spanning less than 1 kb.

Another aspect of the present invention is an array of any one of the tenth, twelfth, fourteenth or sixteenth
embodiments, wherein said biallelic markers in each of these groups are located within a genomic region spanning from 1
to 5 kb.

Another aspect of the present invention is an array of any one of the tenth, twelfth, fourteenth or sixteenth
embodiments, wherein the biallelic markers in each of these groups are located within a genomic region spanning from 5
to 10 kb.

Another aspect of the present invention is an array of any one of the tenth, twelfth, fourteenth or sixteenth
embodiments, wherein the biallelic markers in each of these groups are located within a genomic region spanning from
10 to 25 kb.

Another aspect of the present invention is an array of any one of the tenth, twelfth, fourteenth or sixteenth
embodiments, wherein the biallelic markers in each of these groups are located within a genomic region spanning from
25 to 50 kb.

Another aspect of the present invention is an array of any one of the tenth, twelfth, fourteenth or sixteenth
embodiments, wherein the biallelic markers in each of these groups are located within a genomic region spanning from
50 to 150 kb.

Another aspect of the present invention is an array of any one of the tenth, twelfth, fourteenth or sixteenth
embodiments, wherein the biallelic markers in each of these groups are located within a genomic region spanning from
150 to 250 kb.

Another aspect of the present invention is an array of any one of the tenth, twelfth, fourteenth or sixteenth
embodiments, wherein the biallelic markers in each of these groups are located within a genomic region spanning from
250 to 500 kb.

Another aspect of the present invention is an array of any one of the tenth, twelfth, fourteenth or sixteenth
embodiments, wherein the biallelic markers in each of these groups are located within a genomic region spanning from
500 kb to 1 Mb.

Another aspect of the present invention is an array of any one of the tenth, twelfth, fourteenth or sixteenth
embodiments, wherein the biallelic markers in each of these groups are located within a genomic region spanning more
than 1 Mb.

Another aspect of the present invention is an array of any one of the tenth, twelfth, fourteenth or sixteenth
embodiments, wherein each group of biallelic markers comprises at least 3 biallelic markers.

Another aspect of the present invention is an array of any one of the tenth, twelfth, fourteenth or sixteenth
embodiments, wherein each group of biallelic markers comprises at least 6 biallelic markers.

Another aspect of the present invention is an array of any one of the tenth, twelfth, fourteenth or sixteenth
embodiments, wherein each group of biallelic markers comprises at least 20 biallelic markers.

A seventeenth embodiment of the present invention is a method for determining whether an individual is at risk
of developing a detectable trait or suffers from a detectable trait associated with said trait comprising the steps of
obtaining a nucleic acid sample from said individual, screening said nucleic acid sample with one or more biallelic markers
obtained by the method of the first embodiment, and determining whether said nucleic acid sample contains one or more
biallelic markers statistically associated with said detectable trait. In one aspect of this seventeenth embodiment, the
detectable trait is selected from the group consisting of disease, drug response, drug efficacy and drug toxicity. In
another aspect of this seventeenth embodiment the biallelic markers were obtained by the method of the sixth
embodiment. In another aspect of this seventeenth embodiment, the biallelic markers were obtained by the method of
the eighth embodiment.

An eighteenth embodiment of the present invention is a method of using a drug comprising obtaining a nucleic
acid sample from an individual, determining the identity of the polymorphic base of one or more biallelic markers obtained
by the method of the first embodiment which is associated with a positive response to treatment with said drug or one
or more biallelic markers obtained by the method of the first embodiment which is associated with a negative response
to treatment with said drug, and administering said drug to said individual if said nucleic acid sample contains one or
more biallelic markers associated with a positive response to treatment with said drug or if said nucleic acid sample
lacks one or more biallelic markers associated with a negative response to said drug. In one aspect of this eighteenth
embodiment the determining step comprises determining the identity of the polymorphic base of one or more biallelic
markers obtained by the method of the aspect of the sixth embodiment wherein the trait is drug response which is
associated with a positive response to treatment with said drug or one or more biallelic markers obtained by the aspect
of the sixth embodiment wherein the trait is drug response which is associated with a negative response to treatment
with said drug. In another aspect of this eighteenth embodiment, the determining step comprises determining the
identity of the polymorphic base of one or more biallelic markers obtained by the aspect of the eighth embodiment
wherein the trait is drug response which is associated with a positive response to treatment with said drug or one or
more biallelic markers obtained by the method of the aspect of the eighth embodiment wherein the trait is drug response
which is associated with a negative response to treatment with said drug.

A nineteenth embodiment of the present invention is a method of selecting an individual for inclusion in a
clinical trial of a drug comprising obtaining a nucleic acid sample from an individual, determining the identity of the
polymorphic base of one or more biallelic markers obtained by the method of the first embodiment which is associated
with a positive response to treatment with said drug or one or more biallelic markers associated with a negative
response to treatment with said drug in said nucleic acid sample, and including said individual in said clinical trial if said
nucleic acid sample contains one or more biallelic markers obtained by the method of the first embodiment which is
associated with a positive response to treatment with said drug or if said nucleic acid sample lacks one or more biallelic
markers associated with a negative response to said drug. In one aspect of this nineteenth embodiment, the determining
step comprises determining the identity of the polymorphic base of one or more biallelic markers obtained by the aspect
of the sixth embodiment wherein the trait is drug response which is associated with a positive response to treatment
with said drug or one or more biallelic markers obtained by the aspect of the sixth embodiment wherein the trait is drug
response which is associated with a negative response to treatment with said drug. In another aspect of this nineteenth
embodiment the determining step comprises determining the identity of the polymorphic base of one or more biallelic
markers obtained by the aspect of the eighth embodiment wherein the trait is drug response which is associated with a
positive response to treatment with said drug or one or more biallelic markers obtained by the aspect of the eighth
embodiment wherein the trait is drug response which is associated with a negative response to treatment with said
drug.

A twentieth embodiment of the present invention is a method of identifying a gene associated with a
detectable trait comprising the steps of determining the frequency of each allele of one or more biallelic markers
obtained by the method of the first embodiment in individuals having said detectable trait and individuals lacking said
detectable trait, identifying one or more alleles of one or more biallelic markers having a statistically significant
association with said detectable trait, and identifying a gene in linkage disequilibrium with said one or more alleles.
In one aspect of this twentieth embodiment, the method further comprises identifying a mutation in the gene which is
associated with said detectable trait. In another aspect of this twentieth embodiment, the detectable trait is selected
from the group consisting of disease, drug response, drug efficacy, and drug toxicity.

A twenty-first embodiment of the present invention is a method of identifying a gene associated with a
detectable trait comprising selecting a gene suspected of being associated with a detectable trait and identifying
one or more biallelic markers obtained by the method of the first embodiment within the genomic region harboring said
gene which are associated with said detectable trait. In one aspect of this twenty-first embodiment, the detectable trait
is selected from the group consisting of disease, drug response, drug efficacy, and drug toxicity. In another aspect of
this twenty-first embodiment, the identifying step comprises determining the frequencies of said one or more biallelic
markers in individuals who express said detectable trait and individuals who do not express said detectable trait and
identifying one or more biallelic markers which are statistically associated with the expression of said
detectable trait.

A twenty-second embodiment of the present invention is a method of identifying a haplotype associated with
a trait comprising the steps of obtaining nucleic acid samples from trait positive and trait negative individuals,
conducting an amplification reaction on said nucleic acid samples using amplification primers capable of
generating amplification products containing the polymorphic bases of a plurality of biallelic markers, contacting one or
more arrays according to the tenth embodiment with said amplification products, determining the identities of the
polymorphic bases of said amplification products, and identifying a haplotype having a statistically significant
association with said trait.

A twenty-third embodiment of the present invention is a method of identifying a haplotype associated with a
trait comprising the steps of obtaining nucleic acid samples from trait positive and trait negative individuals, conducting
amplification reactions on said nucleic acid samples using amplification primers capable of generating amplification
products containing the polymorphic bases of a plurality of biallelic markers, contacting one or more arrays according to
the fourteenth embodiment with said amplification products, conducting microsequencing reactions on said
amplification products using microsequencing primers on said arrays, thereby generating elongated microsequencing
primers comprising the polymorphic bases of said amplification products, determining the identities of said polymorphic
bases, and identifying a haplotype having a statistically significant association with said trait.

A twenty-fourth embodiment of the present invention is a method of identifying a haplotype associated with a
trait comprising the steps of obtaining nucleic acid samples from trait positive and trait negative individuals, conducting
amplification reactions on said nucleic acid samples using amplification primers which are capable of generating
amplification products containing the polymorphic bases of a plurality of biallelic markers, conducting microsequencing
reactions on said nucleic acid samples, thereby generating microsequencing products containing the polymorphic bases
of one or more biallelic markers at their 3' ends, said polymorphic bases being detectably labeled, contacting one or more
arrays according to the sixteenth embodiment with said microsequencing products such that said microsequencing
products specifically hybridize to said nucleic acids complementary to said microsequencing primers, determining
the identities of the polymorphic bases of said microsequencing products, and identifying a haplotype having a
statistically significant association with said trait.

A twenty-fifth embodiment of the present invention is a method of identifying a haplotype associated with a
trait comprising the steps of obtaining nucleic acid samples from trait positive and trait negative individuals, contacting
one or more arrays according to the twelfth embodiment with said nucleic acid samples, conducting an amplification
reaction on said nucleic acid samples using amplification primers on said array which are capable of generating
amplification products containing the polymorphic bases of a plurality of biallelic markers, determining the identities of
the polymorphic bases of said amplification products, and identifying a haplotype having a statistically significant
association with said trait.

A twenty-sixth embodiment of the present invention is a method of determining whether an individual is at risk
of developing Alzheimer's disease or whether the individual suffers from Alzheimer's disease as a result of possessing
the Apo E ε4 Site A allele comprising obtaining a nucleic acid sample from said individual, and determining the identity
of the polymorphic base in one or more of the sequences selected from the group consisting of SEQ ID Nos. 301-305 and
SEQ ID Nos. 307-311 or the sequences complementary thereto in said nucleic acid sample. In one aspect of this
twenty-sixth embodiment, the method further comprises determining whether said nucleic acid sample contains the
sequence of SEQ ID No. 306 or the sequence complementary thereto. In another aspect of this twenty-sixth embodiment,
the step of determining the identity of the polymorphic bases in one or more of the sequences selected from the group
consisting of SEQ ID Nos. 301-305 and SEQ ID Nos. 307-311 or the sequences complementary thereto comprises determining
whether said nucleic acid sample contains the sequence of SEQ ID No. 311 (the T allele of marker 99-365/344) or the
sequence complementary thereto. In another version of the preceding aspect, the method further comprises determining
whether said nucleic acid sample contains the sequence of SEQ ID No. 306 or the sequence complementary thereto.

A twenty-seventh embodiment of the present invention is an isolated nucleic acid comprising a sequence
selected from the group consisting of SEQ ID No. 301, SEQ ID No. 307, the sequences complementary thereto, and
fragments comprising at least 8 consecutive nucleotides, including the polymorphic nucleotide, thereof.

A twenty-eighth embodiment of the present invention is an isolated nucleic acid comprising a sequence
selected from the group consisting of SEQ ID No. 302, SEQ ID No. 308, the sequences complementary thereto, and
fragments comprising at least 8 consecutive nucleotides thereof.

A twenty-ninth embodiment of the present invention is an isolated nucleic acid comprising a sequence selected
from the group consisting of SEQ ID No. 303, SEQ ID No. 309, the sequences complementary thereto, and fragments
comprising at least 8 consecutive nucleotides, including the polymorphic nucleotide, thereof.

A thirtieth embodiment of the present invention is an isolated nucleic acid comprising a sequence selected from
the group consisting of SEQ ID No. 304, SEQ ID No. 310, the sequences complementary thereto, and fragments
comprising at least 8 consecutive nucleotides, including the polymorphic nucleotide, thereof.

A thirty-first embodiment of the present invention is an isolated nucleic acid comprising a sequence selected
from the group consisting of SEQ ID No. 305, SEQ ID No. 311, the sequences complementary thereto, and fragments
comprising at least 8 consecutive nucleotides, including the polymorphic nucleotide, thereof.

A thirty-second embodiment of the present invention is an isolated nucleic acid comprising a sequence selected
from the group consisting of SEQ ID Nos. 313-317, SEQ ID Nos. 319-323, and fragments comprising at least 8
consecutive nucleotides thereof.

A thirty-third embodiment of the present invention is an isolated nucleic acid comprising a sequence selected from
the group consisting of SEQ ID Nos. 325-329, SEQ ID Nos. 331-335, the sequences complementary thereto, and
fragments comprising at least 8 consecutive nucleotides thereof.

A thirty-fourth embodiment of the present invention is a set of nucleic acids comprising at least 8 consecutive
nucleotides, including the polymorphic nucleotide, of one or more biallelic markers obtained by the method of the first
embodiment.

A thirty-fifth embodiment of the present invention is a set of nucleic acids comprising amplification primers for
generating an amplification product comprising at least 8 consecutive nucleotides, including the polymorphic nucleotide,
of one or more biallelic markers obtained by the method of the first embodiment.

A thirty-sixth embodiment of the present invention is a set of nucleic acids comprising one or more
microsequencing primers for determining the identity of the polymorphic base of one or more nucleic acids comprising at
least 8 consecutive nucleotides, including the polymorphic nucleotide, of one or more biallelic markers obtained by the
method of the first embodiment.

Brief Description of the Drawings

Figure 1 is a cytogenetic map of chromosome 21.

Figure 2a shows the results of a computer simulation of the distribution of inter-marker spacing on a randomly
distributed set of biallelic markers indicating the percentage of biallelic markers which will be spaced a given distance
apart for 1, 2, or 3 markers/BAC in a genomic map (assuming a set of 20,000 minimally overlapping BACs covering the
genome are evaluated).

Figure 2b shows the results of a computer simulation of the distribution of inter-marker spacing on a randomly
distributed set of biallelic markers indicating the percentage of biallelic markers which will be spaced a given distance
apart for 1, 3, or 6 markers/BAC in a genomic map (assuming a set of 20,000 minimally overlapping BACs covering the
genome are evaluated).

Figure 3 shows, for a series of hypothetical sample sizes, the p-value significance obtained in association
studies performed using individual markers from the high-density biallelic map, according to various hypotheses regarding
the difference of allelic frequencies between the T+ and T- samples.

Figure 4 is a hypothetical association analysis conducted with a map comprising about 3,000 biallelic markers.

Figure 5 is a hypothetical association analysis conducted with a map comprising about 20,000 biallelic
markers.

Figure 6 is a hypothetical association analysis conducted with a map comprising about 60,000 biallelic
markers.

Figure 7 is a haplotype analysis using biallelic markers in the Apo E region.

Figure 8 is a simulated haplotype analysis using the biallelic markers in the Apo E region included in the
haplotype analysis of Figure 7.

Figure 9 shows a minimal array of overlapping clones which was chosen for further studies of biallelic markers
associated with prostate cancer, the positions of STS markers known to map in the candidate genomic region along the
contig, and the locations of biallelic markers along the BAC contig harboring a genomic region harboring a candidate gene
associated with prostate cancer which were identified using the methods of the present invention.

Figure 10 is a rough localization of a candidate gene for prostate cancer which was obtained by determining
the frequencies of the biallelic markers of Figure 9 in affected and unaffected populations.

Figure 11 is a further refinement of the localization of the candidate gene for prostate cancer using additional
biallelic markers which were not included in the rough localization illustrated in Figure 10.

Figure 12 is a haplotype analysis using the biallelic markers in the genomic region of the gene associated with
prostate cancer.

Figure 13 is a simulated haplotype using the six markers included in haplotype 5 of Figure 11.

Detailed Description of the Preferred Embodiment

The human haploid genome contains an estimated 80,000 to 100,000 or more genes scattered on a
3 x 10^9 base-long double stranded DNA shared among the 24 chromosomes. Each human being is diploid, i.e. possesses
two haploid genomes, one from paternal origin, the other from maternal origin. The sequence of the human genome
varies among individuals in a population. About 10^7 sites scattered along the 3 x 10^9 base pairs of DNA are polymorphic,
existing in at least two variant forms called alleles. Most of these polymorphic sites are generated by single base
substitution mutations and are biallelic. Less than 10^5 polymorphic sites are due to more complex changes and are very
often multi-allelic, i.e. exist in more than two allelic forms. At a given polymorphic site, any individual (diploid) can be
either homozygous (twice the same allele) or heterozygous (two different alleles). A given polymorphism or rare mutation
can be either neutral (no effect on trait), or functional, i.e. responsible for a particular genetic trait.

Genetic Maps

The first step towards the identification of genes associated with a detectable trait, such as a disease or any
other detectable trait, consists in the localization of genomic regions containing trait-causing genes using genetic
mapping methods. The preferred traits contemplated within the present invention relate to fields of therapeutic interest;
in particular embodiments, they will be disease traits and/or drug response traits, reflecting drug efficacy or toxicity.
Traits can either be "binary", e.g. diabetic vs. non diabetic, or "quantitative", e.g. elevated blood pressure. Individuals
affected by a quantitative trait can be classified according to an appropriate scale of trait values, e.g. blood pressure
ranges. Each trait value range can then be analyzed as a binary trait. Patients showing a trait value within one such
range will be studied in comparison with patients showing a trait value outside of this range. In such a case, genetic
analysis methods will be applied to subpopulations of individuals showing trait values within defined ranges.
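As a minimal sketch of how a quantitative trait might be reduced to a binary trait, the following Python fragment splits individuals into those whose trait value falls within a chosen range and those whose value falls outside it. The function name, the half-open range convention and the blood pressure values are illustrative assumptions, not taken from the specification.

```python
from typing import Dict, List, Tuple


def split_by_trait_range(
    trait_values: Dict[str, float], low: float, high: float
) -> Tuple[List[str], List[str]]:
    """Return (inside, outside): individuals whose trait value lies in
    [low, high) and individuals whose value lies outside that range."""
    inside = sorted(k for k, v in trait_values.items() if low <= v < high)
    outside = sorted(k for k, v in trait_values.items() if not (low <= v < high))
    return inside, outside


# Hypothetical systolic blood pressure values (mm Hg), for illustration only.
pressures = {"ind1": 118.0, "ind2": 152.0, "ind3": 135.0, "ind4": 161.0}

# Treat the 140-180 mm Hg range as the "trait positive" class.
positive, negative = split_by_trait_range(pressures, 140.0, 180.0)
print("within range:", positive)
print("outside range:", negative)
```

Each range of interest can be passed through the same function in turn, giving one binary comparison per range.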

Genetic mapping involves the analysis of the segregation of polymorphic loci in trait
positive and trait negative populations. Polymorphic loci constitute a small fraction of the human
genome (less than 1%), compared to the vast majority of human genomic DNA which is identical in
sequence among the chromosomes of different individuals. Among all existing human polymorphic
loci, genetic markers can be defined as genome-derived polynucleotides which are sufficiently
polymorphic to allow a reasonable probability that a randomly selected person will be heterozygous,
and thus informative for genetic analysis by methods such as linkage analysis or association studies.

A genetic map consists of a collection of polymorphic markers which have been positioned on the human
chromosomes. Genetic maps may be combined with physical maps, collections of ordered overlapping fragments of
genomic DNA whose arrangement along the human chromosomes is known. The optimal genetic map should possess
the following characteristics:

• the density of the genetic markers scattered along the genome should be sufficient to allow the identification and
localization of any trait-related polymorphism.
• each marker should have an adequate level of heterozygosity, so as to be informative in a large percentage of different
meioses.
• all markers should be easily typed on a routine basis, at a reasonable expense, and in a reasonable amount of time.
• the entire set of markers per chromosome should be ordered in a highly reliable fashion.

However, while the above maps are optimal, it will be appreciated that the maps of the present invention may
be used in the individual marker and haplotype association analyses described below without the necessity of
determining the order of biallelic markers derived from a single BAC with respect to one another.

Genetic Maps Based on RFLPs or VNTRs

The analysis of DNA polymorphisms has relied on the following types of polymorphisms. The first generation
of genetic markers were restriction fragment length polymorphisms (RFLPs), single nucleotide polymorphisms which
occur at restriction sites, thereby modifying the cleavage pattern of the corresponding restriction enzyme. Though the
original methods used to type RFLPs were material-, effort- and time-consuming, today these markers can easily be
typed by PCR-based technologies. Since they are biallelic markers (they present only two alleles, the restriction site
being either present or absent), their maximum heterozygosity is 0.5. The theoretical number of RFLPs distributed along
the entire human genome is more than 10^5, which leads to a potential average inter-marker distance of 30 kilobases.
However, in reality the number of evenly distributed RFLPs which occur at a sufficient frequency in the population to
make them useful for tracking of genetic polymorphisms is very limited.

The second generation of genetic markers was VNTRs (Variable Number of Tandem Repeats), which can be
categorized as either minisatellites or microsatellites. Minisatellites are tandemly repeated DNA sequences present in
units of 5-50 repeats which are distributed along regions of the human chromosomes ranging from 0.1 to 20 kilobases in
length. Since they present many possible alleles, their polymorphic informative content is very high. Minisatellites are
scored by performing Southern blots to identify the number of tandem repeats present in a nucleic acid sample from the
individual being tested. However, there are only 10^4 potential VNTRs that can be typed by Southern blotting.

Microsatellites (also called simple tandem repeat polymorphisms, or simple sequence length polymorphisms)
constitute the most developed category of genetic markers. They include small arrays of tandem repeats of simple
sequences (di-, tri-, tetra-nucleotide repeats) which exhibit a high degree of length polymorphism and thus a high level of
informativeness. Slightly more than 5,000 microsatellites, easily typed by PCR-derived technologies, have been ordered
along the human genome (Dib et al., Nature 380:152 (1996), the disclosure of which is incorporated herein by
reference).

A number of these available microsatellites were used to construct integrated physical and genetic maps
containing less than 5,000 markers. For example, CEPH (Chumakov et al., Nature 377:175-298 (1995) and Cohen et al.,
Nature 366:698-701 (1993), the disclosures of which are incorporated herein by reference), and Whitehead Institute
and Genethon (Hudson et al., 1995), constructed genetic and physical maps covering 75% to 95% of the human genome,
based on 2500 to 5000 microsatellite markers.

However, the number of easily typed informative markers in these maps was too small for the average
distance between informative markers to fulfill the above-listed requirements for genetic maps.

Biallelic Markers

Biallelic markers are genome-derived polynucleotides which exhibit biallelic polymorphism. As used herein, the
term biallelic marker means a biallelic single nucleotide polymorphism. As used herein, the term polymorphism may
include a single base substitution, insertion or deletion. By definition, the lowest allele frequency of a biallelic
polymorphism is 1% (sequence variants which show allele frequencies below 1% are called rare mutations). There are
potentially more than 10^7 biallelic markers which can easily be typed by routine automated techniques, such as
sequence- or hybridization-based techniques, out of which 10^6 are sufficiently informative for mapping purposes.

However, a biallelic marker will show a sufficient degree of informativeness for use in genetic mapping only if the
frequency of its less frequent allele is not less than about 10% (i.e. a heterozygosity rate of at least 0.18) (the
heterozygosity rate for a biallelic marker is 2 Pa (1 - Pa), where Pa is the frequency of allele a). Preferably, the frequency
of the less frequent allele of the biallelic markers in the present maps is at least 20% (i.e. a heterozygosity rate of at
least 0.32). More preferably, the frequency of the less frequent allele of the biallelic markers in the present maps is at
least 30% (i.e. its heterozygosity rate is higher than about 0.42).
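The relation between allele frequency and heterozygosity rate stated above can be checked with a short Python sketch. The function name is an illustrative choice; the formula 2 Pa (1 - Pa) and the 10%, 20% and 30% thresholds are those given in the text, and the 50% case is added only to show the maximum for a biallelic marker.

```python
def heterozygosity(p_a: float) -> float:
    """Heterozygosity rate 2 * Pa * (1 - Pa) of a biallelic marker,
    where p_a is the frequency of one of its two alleles."""
    if not 0.0 <= p_a <= 1.0:
        raise ValueError("allele frequency must lie between 0 and 1")
    return 2.0 * p_a * (1.0 - p_a)


# Frequencies of the less frequent allele discussed in the text,
# plus 0.50, at which the heterozygosity rate reaches its maximum.
for freq in (0.10, 0.20, 0.30, 0.50):
    print(f"less frequent allele at {freq:.2f}: heterozygosity {heterozygosity(freq):.2f}")
```

The function is symmetric in the two alleles, so either allele frequency can be supplied.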

Initial attempts to construct genetic maps based on non-RFLP biallelic markers have focused on identifying
biallelic markers lying within sequence tagged sites (STS), pieces of genomic DNA having a known sequence and
averaging about 250 bases in length. More than 30,000 STSs have been identified and ordered along the genome
(Hudson et al., Science 270:1945-1954 (1995); Schuler et al., Science 274:540-546 (1996), the disclosures of which
are incorporated herein by reference). For example, the Whitehead Institute and Genethon's integrated map contains
15,086 STSs.

These sequence tagged sites can be screened to identify polymorphisms, preferably Single Nucleotide
Polymorphisms (SNPs), more preferably non-RFLP biallelic markers therein. Generally polymorphisms are identified by
determining the sequence of the STSs in 5 to 10 individuals.

Wang et al. (Cold Spring Harbor Laboratory: Abstracts of papers presented on Genome Mapping and
Sequencing, p. 17 (May 14-18, 1997), the disclosure of which is incorporated herein by reference) recently announced the
identification and mapping of 750 Single Nucleotide Polymorphisms issued from the sequencing of 12,000 STSs from
the Whitehead/MIT map, in eight unrelated individuals. The map was assembled using a high throughput system based
on the utilization of DNA chip technology available from Affymetrix (Chee et al., Science 274:610-614 (1996), the
disclosure of which is incorporated herein by reference).

However, according to experimental data and statistical calculations, less than one out of 10 of all STSs
mapped today will contain an informative Single Nucleotide Polymorphism. This is primarily due to the short length of
existing STSs (usually less than 250 bp). If one assumes 10^6 informative SNPs spread along the human genome, there
would on average be one marker of interest every 3 x 10^9/10^6, i.e. every 3,000 bp. The probability that one such marker
is present on a 250 bp stretch is thus less than 1/10.
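The arithmetic of the preceding paragraph can be reproduced with the following Python sketch. The genome size, number of informative SNPs and STS length are the figures given in the text; treating marker positions as a uniform random (Poisson) scatter to obtain the probability of at least one marker is an added assumption used only for illustration.

```python
import math

GENOME_BP = 3_000_000_000      # 3 x 10^9 base pairs
INFORMATIVE_SNPS = 1_000_000   # 10^6 informative SNPs assumed in the text
STS_LENGTH_BP = 250            # typical upper length of an existing STS

# Average spacing between informative markers.
spacing_bp = GENOME_BP / INFORMATIVE_SNPS

# Expected number of informative markers falling within one STS.
expected_per_sts = STS_LENGTH_BP / spacing_bp

# Probability of at least one marker in an STS under a Poisson model
# (an assumption of this sketch, not a statement from the specification).
p_at_least_one = 1.0 - math.exp(-expected_per_sts)

print(f"average spacing: {spacing_bp:.0f} bp")
print(f"expected markers per STS: {expected_per_sts:.3f}")
print(f"P(at least one marker): {p_at_least_one:.3f}")
```

Both the expected count and the Poisson probability fall below 1/10, in line with the conclusion stated above.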

While it could produce a high density map, the STS approach based on currently existing markers does not put
any systematic effort into making sure that the markers obtained are optimally distributed throughout the entire
genome. Instead, polymorphisms are limited to those locations for which STSs are available.

The even distribution of markers along the chromosomes is critical to the future success of genetic analyses.
In particular, a high density map having appropriately spaced markers is essential for conducting association studies on
sporadic cases, aiming at identifying genes responsible for detectable traits such as those which are described below.

As will be further explained below, genetic studies have mostly relied in the past on a statistical approach
called linkage analysis, which took advantage of microsatellite markers to study their inheritance pattern within families
from which a sufficient number of individuals presented the studied trait. Because of intrinsic limitations of linkage
analysis, which will be further detailed below, and because these studies necessitate the recruitment of adequate family
pedigrees, they are not well suited to the genetic analysis of all traits, particularly those for which only sporadic cases
are available (e.g. drug response traits), or those which have a low penetrance within the studied population.

Association studies offer an alternative to linkage analysis. Combined with the use of a high density map of
appropriately spaced, sufficiently informative markers, association studies, including linkage disequilibrium-based
genome wide association studies, will enable the identification of most genes involved in complex traits.

The present invention relates to a method for generating a high density linkage disequilibrium-based genetic
map of the human genome which will allow the identification of sufficiently informative markers spaced at intervals
which permit their use in identifying genes responsible for detectable traits using genome-wide association studies and
linkage disequilibrium mapping.

Construction of a Physical Map
The first step in constructing a high density genetic map of biallelic markers is the construction of a physical
map. Physical maps consist of ordered, overlapping cloned fragments of genomic DNA covering a portion of the genome,
preferably covering one or all chromosomes. Obtaining a physical map of the genome entails constructing and ordering a
genomic DNA library.

Physical mapping in complex genomes such as the human genome (3,000 Megabases) requires the construction
of DNA libraries containing large inserts (on the order of 0.1 to 1 Megabase). It is crucial that such libraries be easy to
construct, screen and manipulate, and that the DNA inserts be stable and relatively free of chimerism.

Yeast artificial chromosomes (YACs; Burke et al., Science 236:806-812 (1987), the disclosure of which is
incorporated herein by reference) have provided an invaluable tool in the analysis of complex genomes since their cloning
capacity is extremely high (in the Mb range). YAC libraries containing large DNA inserts (up to 2 Mb) have been used to
generate STS-content maps of individual chromosomes or of the entire human genome (Chumakov et al. (1995), supra;
Hudson et al. (1995), supra; Cohen et al., Nature 366:698-701 (1993); Chumakov et al., Nature 359:380-387 (1992);
Gemmill et al., Nature 377:299-319 (1995); Doggett et al., Nature 377:335-365 (1995); the disclosures of which are
incorporated herein by reference).

The present genetic maps may be constructed using currently available YAC genomic libraries such as the
CEPH human YAC library as a starting material (Chumakov et al. (1995), supra). Alternatively, one may construct a
YAC genomic library as described in Chumakov et al., 1995, the disclosure of which is incorporated herein by reference,
or as described below.

Once a YAC genomic library has been obtained, the genomic DNA fragments therein are ordered. Ordering may
be performed directly on the genomic DNA in the YAC library. However, direct ordering of YAC inserts is not preferred
because YAC libraries often exhibit a high rate of chimerism (40 to 50% of YAC clones contain fragments from more
than one genomic region), often suffer from clonal instability within their genomic DNA inserts, and require tedious
procedures to manipulate and isolate the insert DNA. Instead, it is preferable to conduct the mapping and sequencing
procedures required for ordering the genomic DNA in a system which enables the stable cloning of large inserts while
being easy to manipulate using standard molecular biology techniques.

Accordingly, it is preferable to clone the genomic DNA into bacterial single copy plasmids, for example BACs
(Bacterial Artificial Chromosomes), rather than into YACs. Bacterial artificial chromosomes are well suited for use in
ordering genomic DNA fragments. BACs provide a low rate of chimerism and fragment rearrangement together with
relative ease of insert isolation. Thus BAC libraries are well suited to integrate genetic, STS and cytogenetic
information while providing direct access to stable, readily-sequenceable genomic DNA. An example of a bacterial artificial
chromosome is the BAC cloning system of Shizuya et al., which is capable of stably propagating and maintaining
relatively large genomic DNA fragments (up to 300 kb long) as single-copy plasmids in E. coli (Shizuya et al., Proc. Natl.
Acad. Sci. USA 89:8794-8797 (1992), the disclosure of which is incorporated herein by reference).

Example 1 describes the construction of a BAC library containing human genomic DNA. It will be appreciated
that the source of the genomic DNA, the enzymes used to digest the DNA, the vectors into which the genomic DNA is
inserted, and the size of the DNA inserts which are cloned into said vectors need not be identical to those described in
Example 1 below. Rather, the genomic DNA may be obtained from any appropriate source, may be digested with any
appropriate enzyme, and may be cloned into any suitable vector. Insert size may vary within any range compatible with
the cloning system chosen and with the intended purpose of the library being constructed. Typically, using BAC vectors
to construct DNA libraries covering the entire human genome, insert size may vary between 50 kb and 300 kb, preferably
100 kb and 200 kb.

Example 1
Construction of a BAC Library
Three different human genomic DNA libraries were produced by cloning partially digested DNA from a human
lymphoblastoid cell line (derived from individual N° 8445, CEPH families) into the pBeloBAC11 vector (Kim et al.,
Genomics 34:213-218 (1996), the disclosure of which is incorporated herein by reference). One library was produced
using a BamHI partial digestion of the genomic DNA from the lymphoblastoid cell line and contains 110,000 clones
having an average insert size of 150 kb (corresponding to 5 human haploid genome equivalents). Another library was
prepared from a HindIII partial digest and corresponds to 3 human genome equivalents with an average insert size of
150 kb. A third library was prepared from a NdeI partial digest and corresponds to 4 human genome equivalents with an
average insert size of 150 kb.
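The coverage figures quoted for these libraries follow from the number of clones, the average insert size and the haploid genome size of about 3,000 Megabases mentioned earlier. The Python sketch below shows the calculation; the function names are illustrative, and the clone counts printed for the HindIII and NdeI libraries are derived here from the stated coverage rather than reported in the text.

```python
import math

HAPLOID_GENOME_KB = 3_000_000  # about 3,000 Megabases, expressed in kb


def genome_equivalents(n_clones: int, mean_insert_kb: float) -> float:
    """Total cloned DNA divided by the haploid genome size."""
    return n_clones * mean_insert_kb / HAPLOID_GENOME_KB


def clones_for_coverage(equivalents: float, mean_insert_kb: float) -> int:
    """Smallest number of clones giving at least the requested coverage."""
    return math.ceil(equivalents * HAPLOID_GENOME_KB / mean_insert_kb)


# BamHI library: 110,000 clones with 150 kb average inserts.
print(genome_equivalents(110_000, 150))   # about 5.5, quoted as 5 in the text

# Derived, not stated in the text: clones implied by 3 and 4 genome equivalents.
print(clones_for_coverage(3, 150))
print(clones_for_coverage(4, 150))
```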

Alternatively, the genomic DNA may be inserted into BAC vectors which possess both a high copy number
origin of replication, which facilitates the isolation of the vector DNA, and a low copy number origin of replication.
Cloning of a genomic DNA insert into the high copy number origin of replication inactivates the origin such that clones
containing a genomic insert replicate at low copy number. The low copy number of clones having a genomic insert
therein permits the inserts to be stably maintained. In addition, selection procedures may be designed which enable low
copy number plasmids (i.e. vectors having genomic inserts therein) to be selected. Such vectors and selection procedures
are described in the U.S. Patent Application entitled "High Throughput DNA Sequencing Vector" (GENSET.015A, Serial
No. 09/058,746), the disclosure of which is incorporated herein by reference.

It will be appreciated that the present methods may be practiced using BAC vectors other than those of
Shizuya et al. (1992, supra), or derived from those, or vectors other than BAC vectors which possess the above-
described characteristics.

To construct a physical map of the genome from genomic DNA libraries, the library clones have to be ordered
along the human chromosomes. In a preferred embodiment, a minimal subset of the ordered clones will then be chosen
that completely covers the entire genome.

For example, the genomic DNA in the inserts of the above described BAC vectors is ordered using STS markers whose
positions relative to one another and locations along the genome are known using procedures such as those described
herein. The STS markers used to order the BAC inserts may be the STS markers contained in the integrated maps
described above. Alternatively, the STSs may be STSs which are not contained in any of the physical maps described
above. In another embodiment, the STSs may be a combination of STSs included in the physical maps described above
and STSs which are not included in the integrated maps described above.
The BAC vectors are screened with STSs until there is at least one positive BAC clone per STS. Preferably, a
minimally overlapping set of 10,000 to 30,000 BACs having genomic inserts spanning the entire human genome is
identified. More preferably, a minimally overlapping set of 10,000 to 30,000 BACs having genomic inserts of about 100-
300 kb in length spanning the entire human genome is identified. In a preferred embodiment, a minimally overlapping set
of 10,000 to 30,000 BACs having genomic inserts of about 100-150 kb in length spanning the entire human genome is
identified. In a highly preferred embodiment, a minimally overlapping set of 15,000 to 25,000 BACs having genomic
inserts of about 100-200 kb in length spanning the entire human genome is identified. Alternatively, a smaller number of
BACs spanning a set of chromosomes, a single chromosome, a particular subchromosomal region, or any other desired
portion of the genome may be ordered. The BACs may be screened for the presence of STSs as described in Example 2
below.

Example 2
Ordering of a BAC Library: Screening Clones with STSs
The BAC library is screened with a set of PCR-typeable STSs to identify clones containing the STSs. To
facilitate PCR screening of several thousand clones, for example 200,000 clones, pools of clones are prepared.

Three-dimensional pools of the BAC libraries are prepared as described in Chumakov et al. and are screened for
the ability to generate an amplification fragment in amplification reactions conducted using primers derived from the
ordered STSs (Chumakov et al. (1995), supra). A BAC library typically contains 200,000 BAC clones. Since the average
size of each insert is 100-300 kb, the overall size of such a library is equivalent to the size of at least about 7 human
genomes. This library is stored as an array of individual clones in 518 384-well plates. It can be divided into 74 primary
pools (7 plates each). Each primary pool can then be divided into 48 subpools prepared by using a three-dimensional
pooling system based on the plate, row and column address of each clone (more particularly, 7 subpools consisting of all
clones residing in a given microtiter plate; 16 subpools consisting of all clones in a given row; 24 subpools consisting of
all clones in a given column).
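The plate, row and column addressing described above can be illustrated with a short Python sketch. It is only an illustration under stated assumptions: clones and plates are numbered from 0, a 384-well plate is taken to have 16 rows and 24 columns, and the function names `clone_address` and `subpools_for` are hypothetical.

```python
PLATES = 518            # 384-well plates in the library
PLATES_PER_POOL = 7     # plates per primary pool
ROWS, COLS = 16, 24     # assumed layout of a 384-well plate

def clone_address(clone_index):
    """Map a 0-based clone index to (primary_pool, plate_in_pool, row, col)."""
    plate, well = divmod(clone_index, ROWS * COLS)
    if plate >= PLATES:
        raise ValueError("clone index outside the library")
    primary_pool, plate_in_pool = divmod(plate, PLATES_PER_POOL)
    row, col = divmod(well, COLS)
    return primary_pool, plate_in_pool, row, col

def subpools_for(clone_index):
    """Return the primary pool and the plate, row and column subpools that hold a clone."""
    pool, plate, row, col = clone_address(clone_index)
    return pool, ("plate", plate), ("row", row), ("col", col)

# 518 plates grouped by 7 give the 74 primary pools quoted in the text
n_primary_pools = PLATES // PLATES_PER_POOL
```

Each clone therefore contributes DNA to exactly one plate subpool, one row subpool and one column subpool of its primary pool.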

Amplification reactions are conducted on the pooled BAC clones using primers specific for the STSs. For
example, the three dimensional pools may be screened with 45,000 STSs whose positions relative to one another and
locations along the genome are known. Preferably, the three dimensional pools are screened with about 30,000 STSs
whose positions relative to one another and locations along the genome are known. In a highly preferred embodiment,
the three dimensional pools are screened with about 20,000 STSs whose positions relative to one another and locations
along the genome are known.

Amplification products resulting from the amplification reactions are detected by conventional agarose gel
electrophoresis combined with automatic image capturing and processing. PCR screening for an STS involves three
steps: (1) identifying the positive primary pools; (2) for each positive primary pool, identifying the positive plate, row and
column 'subpools' to obtain the address of the positive clone; (3) directly confirming the PCR assay on the identified
clone. PCR assays are performed with primers specifically defining the STS.
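Step (2) of the screening can be sketched as follows. This is a minimal Python illustration, not the procedure's actual software: the function name `candidate_clones` and the 0-based numbering are assumptions. When exactly one plate, one row and one column subpool are positive the address is unique; when several are positive, every combination is a candidate to be checked in step (3).

```python
from itertools import product

PLATES_PER_POOL = 7  # plates per primary pool, as in the text

def candidate_clones(primary_pool, pos_plates, pos_rows, pos_cols):
    """List (absolute_plate, row, col) addresses consistent with the positive subpools."""
    return [
        (primary_pool * PLATES_PER_POOL + plate, row, col)
        for plate, row, col in product(sorted(pos_plates), sorted(pos_rows), sorted(pos_cols))
    ]

# One positive subpool in each dimension: a single address
single = candidate_clones(3, {2}, {5}, {17})

# Two positive plates and two positive rows: four candidates to confirm individually
ambiguous = candidate_clones(3, {2, 4}, {5, 9}, {17})
```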

Screening is conducted as follows. First, BAC DNA containing the genomic inserts is prepared as follows.

Bacteria containing the BACs are grown overnight at 37°C in 120 µl of LB containing chloramphenicol (12 µg/ml). DNA
is extracted by the following protocol:

Centrifuge 10 min at 4°C and 2000 rpm
Eliminate supernatant and resuspend pellet in 120 µl TE 10-2 (Tris HCl 10 mM, EDTA 2 mM)
Centrifuge 10 min at 4°C and 2000 rpm
Eliminate supernatant and incubate pellet with 20 µl lysozyme 1 mg/ml during 15 min at room temperature
Add 20 µl proteinase K 100 µg/ml and incubate 15 min at 60°C
Add 8 µl DNAse 2 U/µl and incubate 1 hr at room temperature
Add 100 µl TE 10-2 and keep at -80°C

PCR assays are performed using the following protocol:

Final volume                                          15 µl
BAC DNA                                               1.7 ng/µl
MgCl2                                                 2 mM
dNTP (each)                                           200 µM
primer (each)                                         2.0 ng/µl
Ampli Taq Gold DNA polymerase                         0.05 unit/µl
PCR buffer (10x = 0.1 M TrisHCl pH 8.3, 0.5 M KCl)    1x
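The final concentrations in the table translate into per-reaction amounts by simple multiplication with the 15 µl reaction volume. The Python sketch below works through that arithmetic; it assumes the table values as read here (in particular the primer concentration of 2.0 ng/µl and a dNTP concentration in µM, as in Example 6), and the variable names are illustrative only.

```python
FINAL_VOLUME_UL = 15.0

# Final concentrations per microlitre, as read from the table above
final_conc = {
    "BAC DNA (ng)": 1.7,
    "primer, each (ng)": 2.0,
    "polymerase (units)": 0.05,
}

# Amount of each component in one 15 ul reaction
per_reaction = {name: conc * FINAL_VOLUME_UL for name, conc in final_conc.items()}

# 200 uM equals 200 pmol per ul, so each dNTP is present at 200 * 15 pmol
dntp_pmol_each = 200 * FINAL_VOLUME_UL

# A 10x buffer used at 1x takes one tenth of the reaction volume
buffer_10x_ul = FINAL_VOLUME_UL / 10
```

Under these assumptions one reaction holds about 25.5 ng of BAC DNA, 30 ng of each primer, 0.75 unit of polymerase, 3 nmol of each dNTP and 1.5 µl of 10x buffer.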



The amplification is performed on a Genius II thermocycler. After heating at 95°C for 10 min, 40 cycles are performed.
Each cycle comprises: 30 sec at 95°C, 54°C for 1 min, and 30 sec at 72°C. For final elongation, 10 min at 72°C end
the amplification. PCR products are analyzed on 1% agarose gel with 0.1 mg/ml ethidium bromide.

Alternatively, a YAC (Yeast Artificial Chromosome) library can be used. The very large insert size, of the order
of 1 megabase, is the main advantage of the YAC libraries. The library can typically include about 33,000 YAC clones as
described in Chumakov et al. (1995, supra). The YAC screening protocol may be the same as the one used for BAC
screening.

The known order of the STSs is then used to align the BAC inserts in an ordered array (contig) spanning the
whole human genome. If necessary, new STSs to be tested can be generated by sequencing the ends of selected BAC
inserts. Subchromosomal localization of the BACs can be established and/or verified by fluorescence in situ hybridization
(FISH), performed on metaphasic chromosomes as described by Cherif et al. 1990 and in Example 8 below. BAC insert
size may be determined by Pulsed Field Gel Electrophoresis after digestion with the restriction enzyme NotI.
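The alignment of BAC inserts from STS content can be sketched in Python. This is a simplified illustration rather than the method actually used: it assumes each BAC is summarized by the set of STSs amplified from it, that the STS order is given as a list, and that two BACs belong to the same contig when their STS index ranges overlap. The names `order_bacs` and `contigs` are hypothetical.

```python
def order_bacs(sts_order, bac_hits):
    """Sort BACs by the leftmost and rightmost ordered STS they contain.

    sts_order: list of STS names in their known genomic order.
    bac_hits:  dict mapping BAC name -> set of STS names amplified from it.
    Returns a list of (bac, first_index, last_index) sorted along the genome.
    """
    rank = {sts: i for i, sts in enumerate(sts_order)}
    spans = []
    for bac, hits in bac_hits.items():
        idx = sorted(rank[s] for s in hits if s in rank)
        if idx:
            spans.append((bac, idx[0], idx[-1]))
    return sorted(spans, key=lambda t: (t[1], t[2]))

def contigs(spans):
    """Group ordered BAC spans into contigs of BACs whose STS index ranges overlap."""
    groups, reach = [], None
    for bac, first, last in spans:
        if reach is None or first > reach:
            groups.append([bac])       # no overlap with the previous group: new contig
            reach = last
        else:
            groups[-1].append(bac)     # overlaps the current contig: extend it
            reach = max(reach, last)
    return groups

sts_order = ["S1", "S2", "S3", "S4", "S5"]
bac_hits = {"B2": {"S2", "S3"}, "B1": {"S1", "S2"}, "B3": {"S5"}}
ordered = order_bacs(sts_order, bac_hits)
grouped = contigs(ordered)
```

In this toy example B1 and B2 share STS S2 and form one contig, while B3 stands alone until a new STS bridges the gap.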

Finally, a minimally overlapping set of BAC clones, with known insert size and subchromosomal location,
covering the entire genome, a set of chromosomes, a single chromosome, a particular subchromosomal region, or any
other desired portion of the genome is selected from the DNA library. For example, the BAC clones may cover at least
100 kb of contiguous genomic DNA, at least 250 kb of contiguous genomic DNA, at least 500 kb of contiguous genomic
DNA, at least 2 Mb of contiguous genomic DNA, at least 5 Mb of contiguous genomic DNA, at least 10 Mb of contiguous
genomic DNA, or at least 20 Mb of contiguous genomic DNA.
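Selecting a minimally overlapping set from ordered clones is an interval-covering problem. The Python sketch below uses a standard greedy strategy as an illustration; the text does not say which selection procedure is used, and the clone coordinates (in kb) and the name `minimal_tiling_path` are assumptions made for the example.

```python
def minimal_tiling_path(clones, start, end):
    """Greedy selection of a small set of clones covering [start, end].

    clones: list of (name, left, right) positions in kb.
    Returns the chosen names, or raises ValueError if a gap prevents coverage.
    """
    chosen, covered = [], start
    remaining = sorted(clones, key=lambda c: c[1])
    i = 0
    while covered < end:
        best = None
        # among clones starting at or before the covered point, keep the one reaching farthest
        while i < len(remaining) and remaining[i][1] <= covered:
            if best is None or remaining[i][2] > best[2]:
                best = remaining[i]
            i += 1
        if best is None or best[2] <= covered:
            raise ValueError(f"gap in coverage at {covered} kb")
        chosen.append(best[0])
        covered = best[2]
    return chosen

clones = [("A", 0, 150), ("B", 100, 250), ("C", 120, 300), ("D", 280, 400)]
path = minimal_tiling_path(clones, 0, 400)
```

Clone B is skipped because clone C starts inside A and extends farther, so three clones cover the 400 kb interval.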

Identification of biallelic markers

In order to generate polymorphisms having the adequate informative content to be used as biallelic markers for
genetic mapping, the sequences of random genomic fragments from an appropriate number of unrelated individuals are
compared. Genomic sequences to be screened for biallelic markers may be generated by partially sequencing BAC
inserts, preferably by sequencing the ends of BAC subclones. Sequencing the ends of an adequate number of BAC
subclones derived from a minimally overlapping array of BACs such as those described above will allow the generation of
biallelic markers spanning the entire genome, a set of chromosomes, a single chromosome, a particular subchromosomal
region, or any other desired portion of the genome with an optimized inter-marker spacing.

Thus, portions of the BACs in the selected ordered array are then subcloned and sequenced using, for example,
the procedures described below.

Example 3

Subcloning of BACs

The cells obtained from three liters overnight culture of each BAC clone are treated by alkaline lysis using
conventional techniques to obtain the BAC DNA containing the genomic DNA inserts. After centrifugation of the BAC
DNA in a cesium chloride gradient, ca. 50 µg of BAC DNA are purified. 5-10 µg of BAC DNA are sonicated using three
distinct conditions, to obtain fragments within a desired size range. The obtained DNA fragments are end-repaired in a
50 µl volume with two units of Vent polymerase for 20 min at 70°C, in the presence of the four deoxytriphosphates
(100 µM). The resulting blunt-ended fragments are separated by electrophoresis on preparative low melting point 1%
agarose gels (60 Volts for 3 hours). The fragments lying within a desired size range, such as 600 to 6,000 bp, are
excised from the gel and treated with agarase. After chloroform extraction and dialysis on Microcon 100 columns, DNA
in solution is adjusted to a 100 ng/µl concentration. A ligation to a linearized, dephosphorylated, blunt-ended plasmid
cloning vector is performed overnight by adding 100 ng of BAC fragmented DNA to 20 ng of pBluescript II Sk (+) vector
DNA linearized by enzymatic digestion, and treating with alkaline phosphatase. The ligation reaction is performed in a
10 µl final volume in the presence of 40 units/µl T4 DNA ligase (Epicentre). The ligated products are electroporated into
the appropriate cells (ElectroMAX E. coli DH10B cells). IPTG and X-gal are added to the cell mixture, which is then
spread on the surface of an ampicillin-containing agar plate. After overnight incubation at 37°C, recombinant (white)
colonies are randomly picked and arrayed in 96 well microplates for storage and sequencing.

Alternatively, BAC subcloning may be performed using vectors which possess both a high copy number origin
of replication, which facilitates the isolation of the vector DNA, and a low copy number origin of replication. Cloning of
a genomic DNA fragment into the high copy number origin of replication inactivates the origin such that clones
containing a genomic insert replicate at low copy number. The low copy number of clones having a genomic insert
therein permits the inserts to be stably maintained. In addition, selection procedures may be designed which enable low
copy number plasmids (i.e. vectors having genomic inserts therein) to be selected. In a preferred embodiment, BAC
subcloning will be performed in vectors having the above described features and moreover enabling high throughput
sequencing of long fragments of genomic DNA. Such high throughput high quality sequencing may be obtained after
generating successive deletions within the subcloned fragments to be sequenced, using transposition-based or enzymatic
systems. Such vectors are described in the U.S. Patent Application entitled "High Throughput DNA Sequencing Vector"
(GENSET.015A, Serial No. 09/058,746), the disclosure of which is incorporated herein by reference.

It will be appreciated that other subcloning methods familiar to those skilled in the art may also be employed.
The resulting subclones are then partially sequenced using, for example, the procedures described below.

Example 4

Partial sequencing of BAC subclones

The genomic DNA inserts in the subclones, such as the BAC subclones prepared above, are amplified by
conducting PCR reactions on the overnight bacterial cultures, using primers complementary to vector sequences flanking
the insertions.

The sequences of the insert extremities (on average 500 bases at each end, obtained under routine sequencing
conditions) are determined by fluorescent automated sequencing on ABI 377 sequencers, using ABI Prism DNA
Sequencing Analysis software. Following gel image analysis and DNA sequence extraction, sequence data are
automatically processed with adequate software to assess sequence quality. A proprietary base-caller automatically
flags suspect peaks, taking into account the shape of the peaks, the interpeak resolution, and the noise level. The
proprietary base-caller also performs an automatic trimming. Any stretch of 25 or fewer bases having more than 4 suspect
peaks is usually considered unreliable and is discarded.
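The trimming rule can be illustrated with a sliding-window sketch in Python. The base-caller itself is proprietary and not described, so this is a stand-in technique under stated assumptions: suspect peaks are supplied as a list of booleans, the window is 25 bases, the threshold is read here as 4 suspect peaks, and the name `unreliable_mask` is hypothetical.

```python
def unreliable_mask(suspect, window=25, max_suspect=4):
    """Mark bases lying in any window of `window` bases with more than `max_suspect` suspect peaks.

    suspect: list of booleans, True where the base-caller flagged the peak.
    Returns a list of booleans, True where the base should be discarded.
    """
    n = len(suspect)
    mask = [False] * n
    count = sum(suspect[:window])          # suspect peaks in the first window
    for start in range(0, max(n - window, 0) + 1):
        if start > 0:
            # slide the window one base to the right
            count += suspect[start + window - 1] - suspect[start - 1]
        if count > max_suspect:
            for i in range(start, min(start + window, n)):
                mask[i] = True
    return mask

# Five consecutive suspect peaks in an otherwise clean 65-base read
read_flags = [False] * 30 + [True] * 5 + [False] * 30
mask = unreliable_mask(read_flags)
```

Because a shorter stretch with too many suspect peaks always lies inside some 25-base window, a fixed window covers the "25 or fewer bases" wording.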

The sequenced regions of the subclones, such as the BAC subclones prepared above, are then analyzed in
order to identify biallelic markers lying therein. The frequency at which biallelic markers will be detected in the
screening process varies with the average level of heterozygosity desired. For example, if biallelic markers having an
average heterozygosity rate of greater than 0.42 are desired, they will occur every 2.5 to 3 kb on average. Therefore,
on average, six 500 bp genomic fragments have to be screened in order to derive 1 biallelic marker having an adequate
informative content.
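The screening effort follows directly from the marker spacing and the fragment length. A small Python sketch of the arithmetic, with the hypothetical helper name `fragments_per_marker`:

```python
import math

def fragments_per_marker(marker_spacing_kb, fragment_bp=500):
    """Number of fragments of `fragment_bp` bases needed to span one marker spacing."""
    return math.ceil(marker_spacing_kb * 1000 / fragment_bp)

low = fragments_per_marker(2.5)   # spacing at the dense end of the quoted range
high = fragments_per_marker(3)    # spacing at the sparse end of the quoted range
```

With one marker every 2.5 to 3 kb, five to six 500 bp fragments cover one marker spacing, which matches the figure of six fragments given above.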

As a preferred alternative to sequencing the ends of an adequate number of BAC subclones, the above
mentioned high throughput deletion-based sequencing vectors, which allow the generation of high quality sequence
information covering fragments of ca. 6 kb, may be used. Having sequence fragments longer than 2.5 or 3 kb enhances
the chances of identifying biallelic markers therein. Methods of constructing and sequencing a nested set of deletions
are disclosed in the U.S. Patent Application entitled "High Throughput DNA Sequencing Vector" (GENSET.015A, Serial
No. 09/058,746), the disclosure of which is incorporated herein by reference.
To identify biallelic markers using partial sequence information derived from subclone ends,
such as the ends of the BAC subclones prepared above, pairs of primers, each one specifically
defining a 500 bp amplification fragment, are designed using the above mentioned partial sequences.
The primers used for the genomic amplification of fragments derived from the subclones, such as
the BAC subclones prepared above, may be designed using the OSP software (Hillier L. and Green
P., Methods Appl. 1:124-8 (1991), the disclosure of which is incorporated herein by reference). The
GC content of the amplification primers preferably ranges between 10 and 75%, more preferably
between 35 and 60%, and most preferably between 40 and 55%. The length of amplification
primers can range from 10 to 100 nucleotides, preferably from 10 to 50, 10 to 30 or more preferably
10 to 20 nucleotides. Shorter primers tend to lack specificity for a target nucleic acid sequence and
generally require cooler temperatures to form sufficiently stable hybrid complexes with the
template. Longer primers are expensive to produce and can sometimes self-hybridize to form hairpin
structures.
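The GC-content and length preferences can be checked with a few lines of Python. This is a minimal sketch, not the OSP software: the default ranges are the most preferred ones quoted above, and the function names `gc_percent` and `primer_ok` are hypothetical.

```python
def gc_percent(primer):
    """Percentage of G and C bases in a primer sequence."""
    seq = primer.upper()
    return 100.0 * sum(base in "GC" for base in seq) / len(seq)

def primer_ok(primer, gc_range=(40.0, 55.0), length_range=(10, 20)):
    """Check a primer against the most preferred GC-content and length ranges."""
    return (length_range[0] <= len(primer) <= length_range[1]
            and gc_range[0] <= gc_percent(primer) <= gc_range[1])

good = primer_ok("ATGCATGCATGCATGCAT")   # 18 nt, about 44% GC
at_rich = primer_ok("ATATATATATAT")      # 12 nt, 0% GC
too_short = primer_ok("GCGCGC")          # 6 nt
```

A real design tool also weighs melting temperature, self-complementarity and product size, which this check deliberately leaves out.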

All primers may contain, upstream of the specific target bases, a common oligonucleotide tail that serves as a
sequencing primer. Those skilled in the art are familiar with primer extensions which can be used for these purposes.

To identify biallelic markers, the sequences corresponding to the partial sequences determined above are
determined and compared in a plurality of individuals. The population used to identify biallelic markers having an
adequate informative content preferably consists of ca. 100 unrelated individuals from a heterogeneous population.

First, DNA is extracted from the peripheral venous blood of each donor using methods such as those described
in Example 5.

Example 5

Extraction of DNA

30 ml of blood are taken from the individuals in the presence of EDTA. Cells (pellet) are collected after
centrifugation for 10 minutes at 2000 rpm. Red cells are lysed by a lysis solution (50 ml final volume: 10 mM Tris
pH 7.6; 5 mM MgCl2; 10 mM NaCl). The solution is centrifuged (10 minutes, 2000 rpm) as many times as necessary to
eliminate the residual red cells present in the supernatant, after resuspension of the pellet in the lysis solution.

The pellet of white cells is lysed overnight at 42°C with 3.7 ml of lysis solution composed of:

• 3 ml TE 10-2 (Tris-HCl 10 mM, EDTA 2 mM) / NaCl 0.4 M
• 200 µl SDS 10%
• 500 µl K-proteinase (2 mg K-proteinase in TE 10-2 / NaCl 0.4 M).

For the extraction of proteins, 1 ml saturated NaCl (6M) (1/3.5 v/v) is added. After vigorous agitation, the
solution is centrifuged for 20 minutes at 10000 rpm.

For the precipitation of DNA, 2 to 3 volumes of 100% ethanol are added to the previous supernatant, and the solution is
centrifuged for 30 minutes at 2000 rpm. The DNA solution is rinsed three times with 70% ethanol to eliminate salts,
and centrifuged for 20 minutes at 2000 rpm. The pellet is dried at 37°C, and resuspended in 1 ml TE 10-1 or 1 ml
water. The DNA concentration is evaluated by measuring the OD at 260 nm (1 unit OD = 50 µg/ml DNA).

To evaluate the presence of proteins in the DNA solution, the OD 260 / OD 280 ratio is determined. Only DNA
preparations having an OD 260 / OD 280 ratio between 1.6 and 2 are used in the subsequent steps described below.
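The two optical density checks reduce to a multiplication and a ratio. The Python sketch below illustrates them under stated assumptions: the absorbance used for quantitation is the conventional 260 nm reading, the conversion factor of 50 µg/ml per OD unit and the ratio limits are taken from the text as read here, and the function names and the optional dilution factor are hypothetical.

```python
def dna_concentration_ug_per_ml(od260, dilution_factor=1.0):
    """Convert an absorbance reading to DNA concentration (1 OD unit = 50 ug/ml)."""
    return od260 * 50.0 * dilution_factor

def is_pure_enough(od260, od280, low=1.6, high=2.0):
    """Accept a preparation whose OD 260 / OD 280 ratio lies within the quoted limits."""
    ratio = od260 / od280
    return low <= ratio <= high

conc = dna_concentration_ug_per_ml(0.5, dilution_factor=20)   # sample diluted 1:20 before reading
accepted = is_pure_enough(0.9, 0.5)    # ratio 1.8
rejected = is_pure_enough(0.5, 0.5)    # ratio 1.0, suggesting protein contamination
```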

Once genomic DNA from every individual in the given population has been extracted, it is preferred that a
fraction of each DNA sample is separated, after which a pool of DNA is constituted by assembling equivalent DNA
amounts of the separated fractions into a single one.

Secondly, the DNA obtained from peripheral blood as described above is amplified using the above mentioned
amplification primers.

Example 6 provides procedures that may be used in the amplification reactions, and the detection of
polymorphisms within the obtained amplicons.

Example 6

Amplification of DNA from Peripheral Blood
and Identification of Biallelic Markers

The amplification of each sequence is performed on pooled DNA samples obtained as in Example 5 above, using
PCR (Polymerase Chain Reaction) as follows:



• final volume 25 µl

• genomic DNA 2 ng/µl

• dNTP (each) 200 µM

10 • primer (each) IS n\}f;A 

• Ampli Taq Gold DNA polymerase (Perkin) 0.05 unit/µl

• PCR buffer (10X = 0.1 M Tris HCl pH, 0.5 M KCl) 1X.

The synthesis of primers is performed following the phosphoramidite method, on a
GENSET UFPS 24.1 synthesizer.
To reduce the expense of preparing amplification primers for use in the above procedures, short primers may be
used. While primers and probes having between 15 and 20 (or more) nucleotides are usually highly specific to a given
nucleic acid sequence, it may be inconvenient and expensive to synthesize a relatively long oligonucleotide for each
analysis. In order to at least partially circumvent this problem, it is often possible to use smaller but still relatively
specific oligonucleotides that are shorter in length to create a manageable library. For example, a library of
oligonucleotides comprising about 8 to 10 nucleotides is conceivable and has already been used for sequencing of a
40,000 bp cosmid DNA (Studier, Proc. Natl. Acad. Sci. USA 86(18):6917-6921 (1989), the disclosure of which is
incorporated herein by reference).

Another potential way to obtain specific primers and probes with a small library of oligonucleotides is to
generate longer, more specific primers and probes from combinations of shorter, less specific oligonucleotides. Libraries
of shorter oligonucleotides, each one being from about five to eight nucleotides in length, have already been used
(Kieleczawa et al., Science 258:1787-1791 (1992); Kotler et al., Proc. Natl. Acad. Sci. USA 90:4241-4245 (1993);
Kaczorowski and Szybalski, Anal. Biochem. 221:127-135 (1994), the disclosures of which are incorporated herein by
reference). Suitable probes and primers of appropriate length can therefore be designed through the association of two
or three shorter oligonucleotides to constitute modular primers. The association between primers can be either covalent,
resulting from the activity of T4 DNA ligase, or non-covalent, through base-stacking energy.

The amplification is performed on a Perkin Elmer 9600 Thermocycler or MJ Research PTC200 with heating lid.
After heating at 95°C for 10 minutes, 40 cycles are performed. Each cycle comprises: 30 sec at 95°C, 1 minute at
54°C, and 30 sec at 72°C. For final elongation, 10 minutes at 72°C ends the amplification.

The quantities of the amplification products obtained are determined on 96-well microtiter plates, using a
fluorimeter and Picogreen as intercalating agent (Molecular Probes).

The sequences of the amplification products are determined using automated dideoxy terminator sequencing
reactions with a dye-primer cycle sequencing protocol. The products of the sequencing reactions are run on sequencing
gels and the sequences are determined using gel image analysis.

The sequence data are evaluated using software designed to detect the presence of biallelic sites among the
pooled amplified fragments. The polymorphism search is based on the presence of superimposed peaks in the
electrophoresis pattern resulting from different bases occurring at the same position. Because each dideoxy terminator
is labeled with a different fluorescent molecule, the two peaks corresponding to a biallelic site present distinct colors
corresponding to two different nucleotides at the same position on the sequence. The software evaluates the intensity
ratio between the two peaks and the intensity ratio between a given peak and surrounding peaks of the same color.
However, the presence of two peaks can be an artifact due to background noise. To exclude such an artifact,
the two DNA strands are sequenced and a comparison between the peaks is carried out. In order to be registered as a
polymorphic sequence, the polymorphism has to be detected on both strands.
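The both-strand requirement can be sketched in Python. The actual detection software is not described, so this is only an illustration under stated assumptions: candidate sites from each strand are given as a mapping from position to the pair of bases seen, reverse-strand positions have already been converted to forward-strand coordinates, and the name `confirmed_sites` is hypothetical.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}

def confirmed_sites(forward_calls, reverse_calls):
    """Keep candidate biallelic sites that are seen on both strands.

    forward_calls: dict position -> frozenset of the two bases called on the forward strand.
    reverse_calls: dict position -> frozenset of the two bases called on the reverse strand,
                   with positions expressed in forward-strand coordinates.
    """
    confirmed = {}
    for pos, alleles in forward_calls.items():
        rev = reverse_calls.get(pos)
        # the reverse strand must show the complementary pair of bases at the same site
        if rev is not None and frozenset(COMPLEMENT[b] for b in rev) == alleles:
            confirmed[pos] = alleles
    return confirmed

forward = {120: frozenset("AG"), 310: frozenset("CT")}
reverse = {120: frozenset("TC"), 310: frozenset("AC")}
kept = confirmed_sites(forward, reverse)
```

Here the A/G site at position 120 is confirmed by T/C on the opposite strand, while the candidate at 310 is rejected as a likely artifact.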

The above procedure permits those amplification products which contain biallelic markers to be identified.
The detection limit for the frequency of biallelic polymorphisms detected by sequencing pools of 100
individuals is about 10% for the minor allele, as verified by sequencing pools of known allelic frequencies. However,
more than 90% of the biallelic polymorphisms detected by the pooling method have a frequency for the minor allele
higher than 25%. Therefore, the biallelic markers selected by this method have a frequency of at least 10% for the minor
allele and 90% or less for the major allele, preferably at least 20% for the minor allele and 80% or less for the major
allele, more preferably at least 30% for the minor allele and 70% or less for the major allele, thus a heterozygosity rate
higher than 0.18, preferably higher than 0.32, more preferably higher than 0.42.
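The heterozygosity rates quoted above agree with the expected heterozygosity of a biallelic site under Hardy-Weinberg proportions, H = 2p(1 - p), where p is the minor allele frequency. The text does not state this formula; the Python sketch below assumes it and reproduces the quoted values.

```python
def heterozygosity(minor_allele_freq):
    """Expected heterozygosity of a biallelic marker, assuming Hardy-Weinberg proportions."""
    p = minor_allele_freq
    return 2 * p * (1 - p)

# Minor allele frequencies quoted in the text and the rates they imply
rates = {p: round(heterozygosity(p), 2) for p in (0.10, 0.20, 0.30)}
```

Minor allele frequencies of 10%, 20% and 30% give rates of 0.18, 0.32 and 0.42 respectively.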

In an initial study to determine the frequency of biallelic markers in the human genome that can be obtained
using the above methods, the following results were obtained. 300 different amplicons derived from 100 individuals, and
covering a total of 150 kb obtained from different genomic regions, were sequenced. A total of 54 biallelic
polymorphisms were identified, indicating that there is one biallelic polymorphism with a heterozygosity rate higher than
0.18 (frequency of the minor allele higher than 10%), preferably higher than 0.38 (frequency of the minor allele higher
than 25%), every 2.5 to 3 kb. Given that the human genome is about 3 x 10^6 kb long, this indicates that, out of the 10^7
biallelic markers present on the human genome, approximately 10^6 have adequate heterozygosity rates for genetic
mapping purposes.
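The extrapolation from the pilot study to the whole genome is a two-step division, worked through below in Python. The genome length of 3 x 10^6 kb is the figure used in the text as read here; the variable names are illustrative.

```python
sequenced_kb = 150        # total sequence covered by the 300 amplicons
polymorphisms = 54        # biallelic polymorphisms found

# Average spacing between detected markers, about 2.8 kb
spacing_kb = sequenced_kb / polymorphisms

genome_kb = 3e6           # approximate length of the human genome in kb

# Expected number of markers of this kind genome-wide, on the order of one million
usable_markers = genome_kb / spacing_kb
```

The observed spacing of about 2.8 kb falls inside the quoted 2.5 to 3 kb range and yields roughly 1.1 million markers, consistent with the order-of-magnitude estimate above.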

Using the procedures of Examples 1-6, sets containing increasing numbers of biallelic markers may be
constructed. For example, the procedures of Examples 1-6 are used to identify 1 to about 50 biallelic markers. In some
embodiments, the procedures of Examples 1-6 are used to identify about 50 to about 200 biallelic markers. In other
embodiments, the procedures of Examples 1-6 are used to identify about 200 to about 500 biallelic markers. In some
embodiments, the procedures of Examples 1-6 are used to identify about 1,000 biallelic markers. In other embodiments,
the procedures of Examples 1-6 are used to identify about 3,000 biallelic markers. In further embodiments, the
procedures of Examples 1-6 are used to identify about 5,000 biallelic markers. In another embodiment, the procedures
of Examples 1-6 are used to identify about 10,000 biallelic markers. In still another embodiment, the procedures of
Examples 1-6 are used to identify about 20,000 biallelic markers. In still another embodiment, the procedures of
Examples 1-6 are used to identify about 40,000 biallelic markers. In still another embodiment, the procedures of
Examples 1-6 are used to identify about 60,000 biallelic markers. In still another embodiment, the procedures of
Examples 1-6 are used to identify about 80,000 biallelic markers. In still another embodiment, the procedures of
Examples 1-6 are used to identify more than 100,000 biallelic markers. In a further embodiment, the procedures of
Examples 1-6 are used to identify more than 120,000 biallelic markers.

As discussed above, the ordered nucleic acids, such as the inserts in BAC clones, which contain the biallelic
markers of the present invention may span a portion of the genome. For example, the ordered nucleic acids may span at
least 100 kb of contiguous genomic DNA, at least 250 kb of contiguous genomic DNA, at least 500 kb of contiguous
genomic DNA, at least 2 Mb of contiguous genomic DNA, at least 5 Mb of contiguous genomic DNA, at least 10 Mb of
contiguous genomic DNA, or at least 20 Mb of contiguous genomic DNA.

In addition, groups of biallelic markers located in proximity to one another along the genome may be identified
within these portions of the genome for use in haplotyping analyses as described below. The biallelic markers included
in each of these groups may be located within a genomic region spanning less than 1 kb, from 1 to 5 kb, from 5 to 10 kb,
from 10 to 25 kb, from 25 to 50 kb, from 50 to 150 kb, from 150 to 250 kb, from 250 to 500 kb, from 500 kb to 1 Mb, or
more than 1 Mb. It will be appreciated that the ordered DNA fragments containing these groups of biallelic markers need not
completely cover the genomic regions of these lengths but may instead be incomplete contigs having one or more gaps
therein. As discussed in further detail below, biallelic markers may be used in single marker and haplotype association
analyses regardless of the completeness of the corresponding physical contig harboring them.

Using the procedures above, 653 biallelic markers, each having two alleles, were identified using sequences
obtained from BACs which had been localized on the genome. In some cases, markers were identified using pooled BACs
and thereafter reassigned to individual BACs using STS screening procedures such as those described in Examples 2 and
7. The sequences of 50 of these 653 biallelic markers are provided in the accompanying Sequence Listing as SEQ ID
Nos. 1-50 and 51-100 (with SEQ ID Nos. 1-50 being one allele of these 50 biallelic markers and SEQ ID Nos. 51-100
being the other allele of these 50 biallelic markers). Although the sequences of SEQ ID Nos. 1-50 and 51-100 will be
used as exemplary markers throughout the present application, it will be appreciated that the biallelic markers used in
the maps of the present invention are not limited to these particular markers, nor are they limited to having the exact
flanking sequences surrounding the polymorphic bases which are enumerated in SEQ ID Nos. 1-50 and 51-100. Rather,
it will be appreciated that the flanking sequences surrounding the polymorphic bases of SEQ ID Nos. 1-50 and 51-100
may be lengthened or shortened to any extent compatible with their intended use and the present invention specifically
contemplates such sequences. The sequences of these 653 biallelic markers, including the sequences of SEQ ID Nos.
1-50 and 51-100, may be used to construct the maps of the present invention as well as in the gene identification and
diagnostic techniques described herein. It will be appreciated that the biallelic markers referred to herein may be of any
length compatible with their intended use provided that the markers include the polymorphic base, and the present
invention specifically contemplates such sequences.

Ordering of biallelic markers

Biallelic markers can be ordered to determine their positions along chromosomes, preferably subchromosomal
regions, most preferably along the above described minimally overlapping ordered BAC arrays, as follows.

The positions of the biallelic markers along chromosomes may be determined using a variety of methodologies.
In one approach, radiation hybrid mapping is used. Radiation hybrid (RH) mapping is a somatic cell genetic approach that
can be used for high resolution mapping of the human genome. In this approach, cell lines containing one or more human
chromosomes are lethally irradiated, breaking each chromosome into fragments whose size depends on the radiation dose.
These fragments are rescued by fusion with cultured rodent cells, yielding subclones containing different portions of the
human genome. This technique is described by Benham et al. (Genomics 4:509-517, 1989) and Cox et al. (Science
250:245-250, 1990), the entire contents of which are hereby incorporated by reference. The random and independent
nature of the subclones permits efficient mapping of any human genome marker. Human DNA isolated from a panel of 80-
100 cell lines provides a mapping reagent for ordering biallelic markers. In this approach, the frequency of breakage
between markers is used to measure distance, allowing construction of fine resolution maps as has been done for ESTs
(Schuler et al., Science 274:540-546, 1996, hereby incorporated by reference).
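How breakage frequency becomes a distance can be illustrated with a commonly used two-point estimate. The text gives no formula, so the Python sketch below is an assumption: it estimates the breakage frequency theta from the fraction of hybrid lines retaining only one of two markers, corrected for the retention frequencies, and converts theta to centiRays as -100 ln(1 - theta). The function names are hypothetical.

```python
import math

def breakage_frequency(a_scores, b_scores):
    """Two-point estimate of the breakage frequency theta between markers A and B.

    a_scores, b_scores: lists of 0/1 retention calls, one per hybrid cell line.
    """
    total = len(a_scores)
    discordant = sum(a != b for a, b in zip(a_scores, b_scores))
    r_a = sum(a_scores) / total            # retention frequency of marker A
    r_b = sum(b_scores) / total            # retention frequency of marker B
    denom = total * (r_a + r_b - 2 * r_a * r_b)
    if denom == 0:
        raise ValueError("markers retained in all or none of the lines are uninformative")
    return min(discordant / denom, 1.0)

def distance_centirays(theta):
    """Convert theta to map distance; the distance is unbounded when theta reaches 1."""
    return float("inf") if theta >= 1.0 else -100.0 * math.log(1.0 - theta)

marker_a = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
marker_b = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]
theta = breakage_frequency(marker_a, marker_b)
```

Markers that are always retained together give theta near 0 and a short distance; markers retained independently give theta near 1.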

RH mapping has been used to generate a high-resolution whole genome radiation hybrid map of human
chromosome 17q22-q25.3 across the genes for growth hormone (GH) and thymidine kinase (TK) (Foster et al., Genomics
33:185-191, 1996), the region surrounding the Gorlin syndrome gene (Obermayr et al., Eur. J. Hum. Genet. 4:242-245,
1996), 60 loci covering the entire short arm of chromosome 12 (Raeymaekers et al., Genomics 29:170-178, 1995), the
region of human chromosome 22 containing the neurofibromatosis type 2 locus (Frazer et al., Genomics 14:574-584, 1992)
and 13 loci on the long arm of chromosome 5 (Warrington et al., Genomics 11:701-708, 1991).

Alternatively, PCR based techniques and human-rodent somatic cell hybrids may be used to determine the
positions of the biallelic markers on the chromosomes. In such approaches, oligonucleotide primer pairs which are capable of
generating amplification products containing the polymorphic bases of the biallelic markers are designed. Preferably, the
oligonucleotide primers are 18-23 bp in length and are designed for PCR amplification. The creation of PCR primers from
known sequences is well known to those with skill in the art. For a review of PCR technology see Erlich, H.A., PCR
Technology: Principles and Applications for DNA Amplification. 1992. W.H. Freeman and Co., New York.

The primers are used in polymerase chain reactions (PCR) to amplify templates from total human genomic DNA.
PCR conditions are as follows: 60 ng of genomic DNA is used as a template for PCR with 80 ng of each oligonucleotide
primer, 0.8 unit of Taq polymerase, and 1 µCi of a 32P-labeled deoxycytidine triphosphate. The PCR is performed in a
microplate thermocycler (Techne) under the following conditions: 30 cycles of 94°C, 1.4 min; 55°C, 2 min; and 72°C, 2 min;
with a final extension at 72°C for 10 min. The amplified products are analyzed on a 6% polyacrylamide sequencing gel and
visualized by autoradiography. If the length of the resulting PCR product is identical to the length expected for an
amplification product containing the polymorphic base of the biallelic marker, then the PCR reaction is repeated with DNA
templates from two panels of human-rodent somatic cell hybrids, BIOS PCRable DNA (BIOS Corporation) and NIGMS
Human-Rodent Somatic Cell Hybrid Mapping Panel Number 1 (NIGMS, Camden, N

PCR is used to screen a series of somatic cell hybrid cell lines containing defined sets of human chromosomes for
the presence of a given biallelic marker. DNA is isolated from the somatic hybrids and used as starting templates for PCR
reactions using the primer pairs from the biallelic marker. Only those somatic cell hybrids with chromosomes containing the
human sequence corresponding to the biallelic marker will yield an amplified fragment. The biallelic markers are assigned to
a chromosome by analysis of the segregation pattern of PCR products from the somatic hybrid DNA templates. The single
human chromosome present in all cell hybrids that give rise to an amplified fragment is the chromosome containing that
biallelic marker. For a review of techniques and analysis of results from somatic cell gene mapping experiments, see
Ledbetter et al., Genomics 6:475-481 (1990).
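The segregation analysis can be sketched as a set computation in Python. This is an illustration under stated assumptions: each hybrid line is described by the set of human chromosomes it carries, and the function `assign_chromosome` (a hypothetical name) additionally requires the chromosome to be absent from every PCR-negative hybrid, a concordance condition that goes slightly beyond the wording above.

```python
def assign_chromosome(panel, positives):
    """Return chromosomes present in every PCR-positive hybrid and absent from every negative one.

    panel:     dict hybrid name -> set of human chromosomes it carries.
    positives: set of hybrid names that gave an amplified fragment.
    """
    candidates = set().union(*panel.values())
    for hybrid, chroms in panel.items():
        if hybrid in positives:
            candidates &= chroms      # must be carried by every positive hybrid
        else:
            candidates -= chroms      # cannot be carried by a negative hybrid
    return candidates

panel = {
    "H1": {"1", "7", "X"},
    "H2": {"7", "12"},
    "H3": {"1", "12"},
}
assignment = assign_chromosome(panel, positives={"H1", "H2"})
```

A result holding exactly one chromosome gives the assignment; an empty or multi-member result signals a discordant panel or the need for more hybrids.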

Example 7 describes a preferred method for positioning of biallelic markers on clones, such as BAC clones,
obtained from genomic DNA libraries.

Example 7

Screening BAC libraries with biallelic markers

Amplification primers enabling the specific amplification of DNA fragments carrying the biallelic markers (including
the 653 biallelic markers obtained above, which include the sequences of SEQ ID Nos. 1-50 and 51-100) may be used to
screen clones in any genomic DNA library, preferably the BAC libraries described above, for the presence of the biallelic
markers.

Pairs of primers were designed which allowed the amplification of fragments carrying the 653 biallelic markers
obtained above. The amplification primers may be used to screen clones in a genomic DNA library for the presence of the
653 biallelic markers. For example, pairs of amplification primers of SEQ ID Nos. 101-150 and 151-200 may be used to
amplify fragments which include the polymorphic bases of the biallelic markers of SEQ ID Nos. 1-50 and 51-100.

It will be appreciated that amplification primers for the biallelic markers may be any sequences which allow the
specific amplification of any DNA fragment carrying the markers and may be designed using techniques familiar to those
skilled in the art. The amplification primers may be oligonucleotides of 8, 10, 15, 20 or more bases in length which
enable the amplification of any fragment carrying the polymorphic site in the markers. The polymorphic base may be in
the center of the amplification product or, alternatively, it may be located off-center. For example, in some
embodiments, the amplification product produced using these primers may be at least 100 bases in length (i.e. 50
nucleotides on each side of the polymorphic base in amplification products in which the polymorphic base is centrally
located). In other embodiments, the amplification product produced using these primers may be at least 500 bases in
length (i.e. 250 nucleotides on each side of the polymorphic base in amplification products in which the polymorphic base
is centrally located). In still further embodiments, the amplification product produced using these primers may be at
least 1000 bases in length (i.e. 500 nucleotides on each side of the polymorphic base in amplification products in which
the polymorphic base is centrally located). Amplification primers such as those described above are included within the
scope of the present invention.

The localization of biallelic markers on BAC clones is performed essentially as described in Example 2.

The BAC clones to be screened are distributed in three dimensional pools as described in Example 1.

Amplification reactions are conducted on the pooled BAC clones using primers specific for the biallelic markers
to identify BAC clones which contain the biallelic markers, using procedures essentially similar to those described in
Example 2.

Amplification products resulting from the amplification reactions are detected by conventional agarose gel
electrophoresis combined with automatic image capturing and processing. PCR screening for a biallelic marker involves
three steps: (1) identifying the positive primary pools; (2) for each positive primary pool, identifying the positive plate,
row and column 'subpools' to obtain the address of the positive clone; (3) directly confirming the PCR assay on the
identified clone. PCR assays are performed with primers defining the biallelic marker.
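Step (2) of this screening scheme can be pictured as combining the positive plate, row and column subpools into candidate clone addresses. The sketch below is illustrative only; the function name `candidate_addresses` and the plate, row and column labels are hypothetical and are not taken from the examples.

```python
from itertools import product


def candidate_addresses(positive_plates, positive_rows, positive_columns):
    """List every (plate, row, column) address consistent with the positive subpools.

    One positive subpool per dimension gives a single address. Several positives
    in a dimension give several candidates, each of which would still need the
    direct confirmation of step (3).
    """
    return sorted(product(positive_plates, positive_rows, positive_columns))


unique = candidate_addresses([7], ["C"], [11])
ambiguous = candidate_addresses([7, 9], ["C"], [2, 11])
```

When more than one clone in a primary pool is positive, the candidate list grows multiplicatively, which is why the final confirmation on the identified clone is part of the procedure.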

Screening is conducted as follows. First BAC DNA is isolated as follows. Bacteria containing the genomic
inserts are grown overnight at 37°C in 120 µl of LB containing chloramphenicol (12 µg/ml). DNA is extracted by the
following protocol:

Centrifuge 10 min at 4°C and 2000 rpm
Eliminate supernatant and resuspend pellet in 120 µl TE 10-2 (Tris HCl 10 mM, EDTA 2 mM)
Centrifuge 10 min at 4°C and 2000 rpm
Eliminate supernatant and incubate pellet with 20 µl lysozyme 1 mg/ml during 15 min at room temperature
Add 20 µl proteinase K 100 µg/ml and incubate 15 min at 60°C
Add 8 µl DNAse 2 U/µl and incubate 1 hr at room temperature
Add 100 µl TE 10-2 and keep at -80°C



PCR assays are performed using the following protocol:

Final volume                       15 µl
BAC DNA                            1.7 ng/µl
MgCl2                              2 mM
dNTP (each)                        200 µM
primer (each)                      2.9 ng/µl
Ampli Taq Gold DNA polymerase      0.05 unit/µl
PCR buffer (10x = 0.1 M Tris-HCl pH 8.3, 0.5 M KCl)    1x



The amplification is performed on a Genius II thermocycler. After heating at 95°C for 10 min, 40 cycles are
performed. Each cycle comprises: 30 sec at 95°C, 54°C for 1 min, and 30 sec at 72°C. For final elongation, 10 min at
72°C end the amplification. PCR products are analyzed on 1% agarose gel with 0.1 mg/ml ethidium bromide.
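As a planning aid, the cycling program just described can be written down as data and its total hold time added up. This is a sketch under stated assumptions: the dictionary layout and the name `hold_time_minutes` are hypothetical, and ramp times between temperatures, which depend on the instrument, are ignored.

```python
# Cycling program as read from the text: 10 min at 95 C, then 40 cycles of
# 30 s at 95 C / 1 min at 54 C / 30 s at 72 C, then 10 min at 72 C.
PROGRAM = {
    "initial_denaturation_s": 10 * 60,
    "cycles": 40,
    "cycle_steps_s": [30, 60, 30],
    "final_elongation_s": 10 * 60,
}


def hold_time_minutes(program):
    """Sum the programmed hold times in minutes (temperature ramps not included)."""
    total_s = (
        program["initial_denaturation_s"]
        + program["cycles"] * sum(program["cycle_steps_s"])
        + program["final_elongation_s"]
    )
    return total_s / 60


total = hold_time_minutes(PROGRAM)
```

The programmed holds add up to 100 minutes, so an actual run takes somewhat longer once ramping is included.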

Using such procedures, a number of BAC clones carrying selected biallelic markers can be isolated. The
position of these BAC clones on the human genome can be defined by performing STS screening as described in Example
2. Preferably, to decrease the number of STSs to be tested, each BAC can be localized on chromosomal or
subchromosomal regions by procedures such as those described in Examples 8 and 9 below. This localization will allow
the selection of a subset of STSs corresponding to the identified chromosomal or subchromosomal region. Testing each
BAC with such a subset of STSs and taking account of the position and order of the STSs along the genome will allow a
refined positioning of the corresponding biallelic marker along the genome.

In other embodiments, if the DNA library used to isolate BAC inserts or any type of genomic DNA fragments
harboring the selected biallelic markers already constitutes a physical map of the genome or any portion thereof, using the
known order of the DNA fragments will allow the order of the biallelic markers to be established.

As discussed above, it will be appreciated that markers carried by the same fragment of genomic DNA, such as
the insert in a BAC clone, need not necessarily be ordered with respect to one another within the genomic fragment to
conduct single point or haplotype association analysis. However, in other embodiments of the present maps, the order of
biallelic markers carried by the same fragment of genomic DNA may be determined.

The positions of the biallelic markers used to construct the maps of the present invention, including the 653
biallelic markers obtained above, may be assigned to subchromosomal locations using Fluorescence In Situ Hybridization
(FISH) (Cherif et al., Proc. Natl. Acad. Sci. U.S.A. 87:6639-6643 (1990), the disclosure of which is incorporated herein by
reference). FISH analysis is described in Example 8 below.

Example 8

Assignment of Biallelic Markers to Subchromosomal Regions

Metaphase chromosomes are prepared from phytohemagglutinin (PHA)-stimulated blood cell donors. PHA-
stimulated lymphocytes from healthy males are cultured for 72 h in RPMI-1640 medium. For synchronization, methotrexate
(10 µM) is added for 17 h, followed by addition of 5-bromodeoxyuridine (5-BudR, 0.1 mM) for 6 h. Colcemid (1 µg/ml) is
added for the last 15 min before harvesting the cells. Cells are collected, washed in RPMI, incubated with a hypotonic
solution of KCl (75 mM) at 37°C for 15 min and fixed in three changes of methanol:acetic acid (3:1). The cell suspension is
dropped onto a glass slide and air-dried.

BAC clones carrying the biallelic markers used to construct the maps of the present invention (including the 653
biallelic markers obtained above) can be isolated as described above. These BACs or portions thereof, including fragments
carrying said biallelic markers, obtained for example from amplification reactions using pairs of amplification primers as
described above, can be used as probes to be hybridized with metaphasic chromosomes. It will be appreciated that the
hybridization probes to be used in the contemplated method may be generated using alternative methods well known to
those skilled in the art. Hybridization probes may have any length suitable for this intended purpose.

Probes are then labeled with biotin-16 dUTP by nick translation according to the manufacturer's instructions
(Bethesda Research Laboratories, Bethesda, MD), purified using a Sephadex G-50 column (Pharmacia, Uppsala, Sweden) and
precipitated. Just prior to hybridization, the DNA pellet is dissolved in hybridization buffer (50% formamide, 2 X SSC, 10%
dextran sulfate, 1 mg/ml sonicated salmon sperm DNA, pH 7) and the probe is denatured at 70°C for 5-10 min.

Slides kept at -20°C are treated for 1 h at 37°C with RNase A (100 µg/ml), rinsed three times in 2 X SSC and
dehydrated in an ethanol series. Chromosome preparations are denatured in 70% formamide, 2 X SSC for 2 min at 70°C,
then dehydrated at 4°C. The slides are treated with proteinase K (10 µg/100 ml in 20 mM Tris-HCl, 2 mM CaCl2) at 37°C
for 8 min and dehydrated. The hybridization mixture containing the probe is placed on the slide, covered with a coverslip,
sealed with rubber cement and incubated overnight in a humid chamber at 37°C. After hybridization and post-hybridization
washes, the biotinylated probe is detected by avidin-FITC and amplified with additional layers of biotinylated goat anti-avidin
and avidin-FITC. For chromosomal localization, fluorescent R-bands are obtained as previously described (Cherif et al., (1990)
supra). The slides are observed under a LEICA fluorescence microscope (DMRXA). Chromosomes are counterstained with
propidium iodide and the fluorescent signal of the probe appears as two symmetrical yellow-green spots on both chromatids
of the fluorescent R-band chromosome (red). Thus, a particular biallelic marker may be localized to a particular cytogenetic
R-band on a given chromosome.

The above procedure was used to confirm the subchromosomal location of 95% of the BAC clones harboring the
653 markers obtained above. In particular, the 50 markers of SEQ ID Nos. 1-50 and 51-100 were assigned to
subchromosomal regions of chromosome 21. Simple identification numbers were attributed to each BAC from which the
markers are derived. Figure 1 is a cytogenetic map of chromosome 21 indicating the subchromosomal regions therein. Table
1 lists the internal identification number of the localized biallelic markers, the internal identification number of the BACs from
which the markers were derived, the size of the BAC insert, the average intermarker distance in the BAC insert, and the
subchromosomal locations of the biallelic markers. The sequences of the localized markers are provided as SEQ ID Nos. 1-50
and 51-100 in the accompanying Sequence Listing. Amplification primers for generating amplification products containing
the polymorphic bases of these markers are also provided as SEQ ID Nos. 101-150 and 151-200 in the accompanying
Sequence Listing. Microsequencing primers for use in determining the identities of the polymorphic bases of these biallelic
markers are provided in the accompanying Sequence Listing as SEQ ID Nos. 201-250 and 251-300.

The rate at which biallelic markers may be assigned to subchromosomal regions may be enhanced through
automation. For example, probe preparation may be performed in a microtiter plate format using adequate robots. The rate
at which biallelic markers may be assigned to subchromosomal regions may be enhanced using techniques which permit the
in situ hybridization of multiple probes on a single microscope slide, such as those disclosed in Larin et al., Nucleic Acids
Research 22: 3689-3692 (1994), the disclosure of which is incorporated herein by reference. In the largest test format
described, different probes were hybridized simultaneously by applying them directly from a 96-well microtiter dish which
was inverted on a glass plate. Software for image data acquisition and analysis that is adapted to each optical system, test
format, and fluorescent probe used, can be derived from the system described in Lichter et al., Science 247: 64-69 (1990),
the disclosure of which is incorporated herein by reference. Such software measures the relative distance between the
center of the fluorescent spot corresponding to the hybridized probe and the telomeric end of the short arm of the
corresponding chromosome, as compared to the total length of the chromosome. The rate at which biallelic markers are
assigned to subchromosomal locations may be further enhanced by simultaneously applying probes labeled with different
fluorescent tags to each well of the 96 well dish. A further benefit of conducting the analysis on one slide is that it
facilitates automation, since a microscope having a moving stage and the capability of detecting fluorescent signals in
different metaphase chromosomes could provide the coordinates of each probe on the metaphase chromosomes distributed
on the 96 well dish.
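The relative distance measured by such software can be expressed as a simple ratio of two lengths. The sketch below is an illustration only: it assumes a straight chromosome and image coordinates for the spot and the two telomeric ends, and the function name `relative_position` is hypothetical rather than taken from the cited system.

```python
import math


def relative_position(spot, short_arm_end, long_arm_end):
    """Distance from the short-arm telomere to the probe spot, as a fraction of
    the total chromosome length (straight-chromosome approximation)."""
    total_length = math.dist(short_arm_end, long_arm_end)
    if total_length == 0:
        raise ValueError("chromosome ends coincide")
    return math.dist(spot, short_arm_end) / total_length


# Hypothetical pixel coordinates.
fraction = relative_position(spot=(3.0, 4.0), short_arm_end=(0.0, 0.0), long_arm_end=(6.0, 8.0))
```

A value near 0 places the probe close to the short-arm telomere and a value near 1 close to the long-arm telomere; bent chromosomes would need a path length along the chromosome axis instead of a straight-line distance.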

Example 9 below describes an alternative method to position biallelic markers which allows their assignment to
human chromosomes.

Example 9

Assignment of Biallelic Markers to Human Chromosomes

The biallelic markers used to construct the maps of the present invention, including the 653 biallelic markers
obtained above (which include the sequences of SEQ ID Nos. 1-50 and 51-100), may be assigned to a human
chromosome using monosomal analysis as described below.

The chromosomal localization of a biallelic marker can be performed through the use of somatic cell hybrid
panels. For example 24 panels, each panel containing a different human chromosome, may be used (Russell et al.,
Somat. Cell Mol. Genet. 22:425-431 (1996); Drwinga et al., Genomics 16:311-314 (1993), the disclosures of which are
incorporated herein by reference).

The biallelic markers are localized as follows. The DNA of each somatic cell hybrid is extracted and purified.
Genomic DNA samples from a somatic cell hybrid panel are prepared as follows. Cells are lysed overnight at 42°C with
3.7 ml of lysis solution composed of:

3 ml TE 10-2 (Tris HCl 10 mM, EDTA 2 mM) / NaCl 0.4 M
200 µl SDS 10%
500 µl K-proteinase (2 mg K-proteinase in TE 10-2 / NaCl 0.4 M)

For the extraction of proteins, 1 ml saturated NaCl (6M) (1/3.5 v/v) is added. After vigorous agitation, the
solution is centrifuged for 20 min at 10,000 rpm. For the precipitation of DNA, 2 to 3 volumes of 100% ethanol are
added to the previous supernatant and the solution is centrifuged for 30 min at 2,000 rpm. The DNA solution is rinsed
three times with 70% ethanol to eliminate salts, and centrifuged for 20 min at 2,000 rpm. The pellet is dried at 37°C,
and resuspended in 1 ml TE 10-1 or 1 ml water. The DNA concentration is evaluated by measuring the OD at 260 nm (1
unit OD = 50 µg/ml DNA). To determine the presence of proteins in the DNA solution, the OD 260 / OD 280 ratio is
determined. Only DNA preparations having an OD 260 / OD 280 ratio between 1.8 and 2 are used in the PCR assay.
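The two spectrophotometric checks above reduce to a multiplication and a ratio test. The following sketch uses the conversion stated in the text (1 OD unit at 260 nm = 50 µg/ml DNA) and the 1.8 to 2 acceptance window; the function names and the `dilution_factor` parameter are hypothetical additions for illustration.

```python
def dna_concentration_ug_per_ml(od260, dilution_factor=1.0):
    """Convert an OD reading at 260 nm to µg/ml, using 1 OD unit = 50 µg/ml DNA.

    dilution_factor is an assumed convenience for readings taken on a diluted sample.
    """
    return od260 * 50.0 * dilution_factor


def passes_purity_check(od260, od280, low=1.8, high=2.0):
    """True when the OD 260 / OD 280 ratio lies within the accepted window."""
    return low <= od260 / od280 <= high


concentration = dna_concentration_ug_per_ml(0.5)
accepted = passes_purity_check(0.95, 0.5)
rejected = passes_purity_check(0.75, 0.5)
```

A preparation that fails the ratio test would simply be excluded from the PCR assay, as the text specifies.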

Then, a PCR assay is performed on genomic DNA with primers defining the biallelic marker. The PCR assay is
performed as described above for BAC screening. The PCR products are analyzed on a 1% agarose gel containing 0.2
mg/ml ethidium bromide.

The ordering analyses described above may be conducted to generate an integrated genome wide genetic map
comprising about 20,000 biallelic markers (1 biallelic marker per BAC if 20,000 BAC inserts are screened). In some
embodiments, the map includes one or more of the 653 markers obtained above (which include the sequences of SEQ ID
Nos. 1-50 and 51-100 or the sequences complementary thereto).

In another embodiment, the above procedures are conducted to generate a map comprising about 40,000
markers (an average of 2 biallelic markers per BAC if 20,000 BAC inserts are screened). In some embodiments, the map
includes one or more of the 653 markers obtained above (which include the sequences of SEQ ID Nos. 1-50 and 51-100
or the sequences complementary thereto).

In a preferred embodiment, the above procedures are conducted to generate a map
comprising about 60,000 markers (an average of 3 biallelic markers per BAC if 20,000 BAC inserts are screened). In
some embodiments, the map includes one or more of the 653 markers obtained above (which include the sequences of
SEQ ID Nos. 1-50 and 51-100 or the sequences complementary thereto).

In a further preferred embodiment, the above procedures are conducted to generate a map
comprising about 80,000 markers (an average of 4 biallelic markers per BAC if 20,000 BAC inserts are screened). In
some embodiments, the map includes one or more of the 653 markers obtained above (which include the sequences of
SEQ ID Nos. 1-50 and 51-100 or the sequences complementary thereto).

In yet another embodiment, the above procedures are conducted to generate a map comprising about 100,000
markers (an average of 5 biallelic markers per BAC if 20,000 BAC inserts are screened). In some embodiments, the map
includes one or more of the 653 markers obtained above (which include the sequences of SEQ ID Nos. 1-50 and 51-100
or the sequences complementary thereto).

In a further embodiment, the above procedures are conducted to generate a map comprising about 120,000
markers (an average of 6 biallelic markers per BAC if 20,000 BAC inserts are screened). In some embodiments, the map
includes one or more of the 653 markers obtained above (which include the sequences of SEQ ID Nos. 1-50 and 51-100
or the sequences complementary thereto).

Alternatively, maps having the above-specified average numbers of biallelic markers per BAC which comprise
smaller portions of the genome, such as a set of chromosomes, a single chromosome, a particular subchromosomal
region, or any other desired portion of the genome, may also be constructed using the procedures provided herein.

In some embodiments, the biallelic markers in the map are separated from one another by an average distance
of 10-200kb. In further embodiments, the biallelic markers in the map are separated from one another by an average
distance of 15-150kb. In yet another embodiment, the biallelic markers in the map are separated from one another by an
average distance of 20-100kb. In other embodiments, the biallelic markers in the map are separated from one another
by an average distance of 100-150kb. In further embodiments, the biallelic markers in the map are separated from one
another by an average distance of 50-100kb. In yet another embodiment, the biallelic markers in the map are separated
from one another by an average distance of 25-50kb. Maps having the above-specified intermarker distances which
comprise smaller portions of the genome, such as a set of chromosomes, a single chromosome, a particular
subchromosomal region, or any other desired portion of the genome, may also be constructed using the procedures
provided herein.

Figure 2, showing the results of computer simulations of the distribution of inter-marker spacing on a randomly
distributed set of biallelic markers, indicates the percentage of biallelic markers which will be spaced a given distance
apart for a given number of markers/BAC in the genomic map (assuming 20,000 BACs constituting a minimally
overlapping array covering the entire genome are evaluated). One hundred iterations were performed for each
simulation (20,000 marker map, 40,000 marker map, 60,000 marker map, 120,000 marker map).

As illustrated in Figure 2a, 98% of inter-marker distances will be lower than 150kb provided 60,000 evenly
distributed markers are generated (3 per BAC); 90% of inter-marker distances will be lower than 150kb provided 40,000
evenly distributed markers are generated (2 per BAC); and 50% of inter-marker distances will be lower than 150kb
provided 20,000 evenly distributed markers are generated (1 per BAC).

As illustrated in Figure 2b, 98% of inter-marker distances will be lower than 80kb provided 120,000 evenly
distributed markers are generated (6 per BAC); 80% of inter-marker distances will be lower than 80kb provided 60,000
evenly distributed markers are generated (3 per BAC); and 15% of inter-marker distances will be lower than 80kb
provided 20,000 evenly distributed markers are generated (1 per BAC).
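A simulation of this general kind can be sketched in a few lines. The details of the simulation behind Figure 2 are not given in the text, so everything specific here is an assumption made for illustration: BACs are treated as equal-sized, non-overlapping 150 kb segments laid end to end, a fixed number of markers is placed uniformly at random within each BAC, and a single iteration on a reduced number of BACs is run. The function name `fraction_below` is hypothetical.

```python
import random


def fraction_below(markers_per_bac, threshold_kb, n_bacs=2000, bac_kb=150.0, seed=0):
    """Fraction of adjacent inter-marker distances shorter than threshold_kb.

    Assumptions (not from the source): contiguous, equal-sized BACs of bac_kb
    kilobases and markers placed uniformly at random within each BAC.
    """
    rng = random.Random(seed)
    positions = sorted(
        bac_index * bac_kb + rng.uniform(0.0, bac_kb)
        for bac_index in range(n_bacs)
        for _ in range(markers_per_bac)
    )
    gaps = [right - left for left, right in zip(positions, positions[1:])]
    return sum(gap < threshold_kb for gap in gaps) / len(gaps)


one_per_bac = fraction_below(1, 150.0, n_bacs=20000)
three_per_bac = fraction_below(3, 150.0)
```

Under these assumptions, one marker per BAC gives gaps that are symmetric around the BAC size, so about half of them fall below 150 kb, and the fraction rises as more markers per BAC are added. The exact percentages depend on the insert size distribution and overlap model, which this sketch does not attempt to reproduce.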

As already mentioned, high density biallelic marker maps allow association studies to be performed to identify
genes involved in complex traits.

Association studies examine the frequency of marker alleles in unrelated trait positive (T+) individuals
compared with trait negative (T-) controls, and are generally employed in the detection of polygenic inheritance.

Association studies as a method of mapping genetic traits rely on the phenomenon of linkage disequilibrium,
which is described below.



Linkage Disequilibrium

If two genetic loci lie on the same chromosome, then sets of alleles on the same chromosomal segment (called
haplotypes) tend to be transmitted as a block from generation to generation. When not broken up by recombination,
haplotypes can be tracked not only through pedigrees but also through populations. The resulting phenomenon at the
population level is that the occurrence of pairs of specific alleles at different loci on the same chromosome is not
random, and the deviation from random is called linkage disequilibrium (LD).
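The "deviation from random" can be made concrete with the classical pairwise coefficient D, the difference between an observed haplotype frequency and the product of the two allele frequencies. This is offered as a general illustration of the concept, not as the specific measure used in the examples; the function name `ld_coefficient` and the allele labels are hypothetical.

```python
def ld_coefficient(haplotype_freqs, allele1="A", allele2="B"):
    """Classical D = P(allele1, allele2) - P(allele1) * P(allele2).

    haplotype_freqs maps (allele at locus 1, allele at locus 2) -> haplotype frequency.
    """
    p_joint = haplotype_freqs.get((allele1, allele2), 0.0)
    p_first = sum(f for (a, _), f in haplotype_freqs.items() if a == allele1)
    p_second = sum(f for (_, b), f in haplotype_freqs.items() if b == allele2)
    return p_joint - p_first * p_second


# Alleles combine at random: no disequilibrium.
random_assortment = ld_coefficient(
    {("A", "B"): 0.25, ("A", "b"): 0.25, ("a", "B"): 0.25, ("a", "b"): 0.25}
)
# Only two haplotypes exist: the alleles always travel together.
complete_ld = ld_coefficient({("A", "B"): 0.5, ("a", "b"): 0.5})
```

A value of zero corresponds to random association of the two alleles, and a non-zero value to linkage disequilibrium.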

If a specific allele in a given gene is directly involved in causing a particular trait T, its frequency will be
statistically increased in a T+ population when compared to the frequency in a T- population. As a consequence of the
existence of LD, the frequency of all other alleles present in the haplotype carrying the trait-causing allele (TCA) will also
be increased in T+ individuals compared to T- individuals. Therefore, association between the trait and any allele in
linkage disequilibrium with the trait-causing allele will suffice to suggest the presence of a trait-related gene in that
particular allele's region. Linkage disequilibrium allows the relative frequencies in T+ and T- populations of a limited
number of genetic polymorphisms (specifically biallelic markers) to be analyzed as an alternative to screening all possible
functional polymorphisms in order to find trait-causing alleles.

The present invention then also concerns biallelic markers in linkage disequilibrium with the specific biallelic
markers described above and which are expected to present similar characteristics in terms of their respective
association with a given trait. In a preferred embodiment, the present invention concerns the biallelic markers that are in
linkage disequilibrium with the 653 biallelic markers obtained above (which include the sequences of SEQ ID Nos. 1-50
and 51-100 or the sequences complementary thereto).

LD among a set of biallelic markers having an adequate heterozygosity rate can be determined by genotyping
between 50 and 1000 unrelated individuals, preferably between 75 and 200, more preferably around 100. Genotyping a
biallelic marker consists of determining the specific allele carried by an individual at the given polymorphic base of the
biallelic marker. Genotyping can be performed using similar methods as those described above for the generation of the
biallelic markers, or using other genotyping methods such as those further described below.

LD between any pair of biallelic markers comprising at least one of the biallelic markers of the present
invention (Mi, Mj) can be calculated for every allele combination (Mi1,Mj1; Mi1,Mj2; Mi2,Mj1 and Mi2,Mj2), according to the
Piazza formula:

\[ \Delta_{M_{ik},M_{jl}} = \sqrt{\theta_4} - \sqrt{(\theta_4 + \theta_3)(\theta_4 + \theta_2)} \]

where:
θ4 = - - = frequency of genotypes not having allele k at Mi and not having allele l at Mj
θ3 = - + = frequency of genotypes not having allele k at Mi and having allele l at Mj
θ2 = + - = frequency of genotypes having allele k at Mi and not having allele l at Mj
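Reading the Piazza formula as Δ = √θ4 − √((θ4 + θ3)(θ4 + θ2)), the calculation for one allele combination is a one-liner. The sketch below is a direct transcription under that reading; the function name `piazza_delta` and the input frequencies are hypothetical.

```python
import math


def piazza_delta(theta4, theta3, theta2):
    """Piazza LD estimate for one allele combination (allele k at Mi, allele l at Mj).

    theta4: frequency of genotypes lacking allele k at Mi and lacking allele l at Mj
    theta3: frequency of genotypes lacking allele k at Mi but having allele l at Mj
    theta2: frequency of genotypes having allele k at Mi but lacking allele l at Mj
    """
    return math.sqrt(theta4) - math.sqrt((theta4 + theta3) * (theta4 + theta2))


no_association = piazza_delta(0.25, 0.25, 0.25)
strong_association = piazza_delta(0.36, 0.0, 0.0)
```

With the three frequencies equal at 0.25 the two terms cancel, whereas when the (- +) and (+ -) genotype classes are empty the estimate is clearly positive.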

Linkage disequilibrium (LD) between pairs of biallelic markers (Mi, Mj) can also be calculated for every allele
combination (Mi1,Mj1; Mi1,Mj2; Mi2,Mj1; Mi2,Mj2) according to the maximum likelihood estimate (MLE) for delta (the
composite linkage disequilibrium coefficient), as described by Weir (B.S. Weir, Genetic Data Analysis, (1996), Sinauer
Ass. Eds, the disclosure of which is incorporated herein by reference). This formula allows linkage disequilibrium
between alleles to be estimated when only genotype, and not haplotype, data are available. This LD composite test
makes no assumption for random mating in the sampled population, and thus seems to be more appropriate than other
LD tests for genotypic data.
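The text does not reproduce the composite estimator itself. As commonly stated, it is Δ = n_AB/n − 2·p_A·p_B, where n_AB = 2·n(AABB) + n(AABb) + n(AaBB) + n(AaBb)/2. The sketch below implements that form as an assumption about what is meant; the function name `composite_ld`, the encoding of genotypes as allele-copy counts, and the sample data are hypothetical.

```python
def composite_ld(genotype_counts):
    """Composite LD from two-locus genotype counts (no haplotype phase needed).

    genotype_counts maps (copies of allele A, copies of allele B) -> number of
    individuals, with each copy number in {0, 1, 2}.
    """
    n = sum(genotype_counts.values())
    p_a = sum(a * count for (a, _), count in genotype_counts.items()) / (2 * n)
    p_b = sum(b * count for (_, b), count in genotype_counts.items()) / (2 * n)
    # The weight a*b/2 gives 2 for AABB, 1 for AABb and AaBB, and 1/2 for AaBb.
    n_ab = sum(a * b / 2 * count for (a, b), count in genotype_counts.items())
    return n_ab / n - 2 * p_a * p_b


# Hypothetical sample in which the two loci are independent.
independent = composite_ld({
    (2, 2): 1, (2, 1): 2, (2, 0): 1,
    (1, 2): 2, (1, 1): 4, (1, 0): 2,
    (0, 2): 1, (0, 1): 2, (0, 0): 1,
})
# Hypothetical sample in which A and B always occur together.
associated = composite_ld({(2, 2): 5, (0, 0): 5})
```

Because double heterozygotes are weighted by one half rather than resolved into haplotypes, the estimate can be computed from unphased genotypes, which is the property the text highlights.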

The skilled person will readily appreciate that other LD calculation methods can be used without undue
experimentation.

Example 10 illustrates the measurement of LD between a publicly known biallelic marker, the 'ApoE Site A',
located within the Alzheimer's related ApoE gene, and other biallelic markers randomly derived from the genomic region
containing the ApoE gene.

Example 10

Measurement of Linkage Disequilibrium

As originally reported by Strittmatter et al. and by Saunders et al. in 1993, the Apo E ε4 allele is strongly
associated with both late-onset familial and sporadic Alzheimer's disease (AD) (Saunders, A.M., Lancet 342: 710-711
(1993) and Strittmatter, W.J. et al., Proc. Natl. Acad. Sci. U.S.A. 90: 1977-1981 (1993), the disclosures of which are
incorporated herein by reference). The 3 major isoforms of human Apolipoprotein E (apoE2, -E3, and -E4), as identified by
isoelectric focusing, are coded for by 3 alleles (ε 2, 3, and 4). The ε 2, ε 3, and ε 4 isoforms differ in amino acid
sequence at 2 sites, residue 112 (called site A) and residue 158 (called site B). The ancestral isoform of the protein is
Apo E3, which at sites A/B contains cysteine/arginine, while ApoE2 and -E4 contain cysteine/cysteine and
arginine/arginine, respectively (Weisgraber, K.H. et al., J. Biol. Chem. 256: 9077-9083 (1981); Rall, S.C. et al., Proc.
Natl. Acad. Sci. U.S.A. 79: 4696-4700 (1982), the disclosures of which are incorporated herein by reference).

Apo E ε 4 is currently considered as a major susceptibility risk factor for AD development in individuals of
different ethnic groups (specially in Caucasians and Japanese compared to Hispanics or African Americans), across all
ages between 40 and 90 years, and in both men and women, as reported recently in a study performed on 5930 AD
patients and 8607 controls (Farrer et al., JAMA 278:1349-1356 (1997), the disclosure of which is incorporated herein
by reference). More specifically, the frequency of a C base coding for arginine 112 at site A is significantly increased in
AD patients.

Although the mechanistic link between Apo E ε 4 and neuronal degeneration characteristic of AD remains to be
established, current hypotheses suggest that the Apo E genotype may influence neuronal vulnerability by increasing the
deposition and/or aggregation of the amyloid beta peptide in the brain or by indirectly reducing energy availability to
neurons by promoting atherosclerosis.

Using the methods of the present invention, biallelic markers that are in the vicinity of the Apo E site A were
generated and the association of one of their alleles with Alzheimer's disease was analyzed. An Apo E public marker
(stSG94) was used to screen a human genome BAC library as previously described. A BAC, which gave a unique FISH
hybridization signal on chromosomal region 19q13.2.3, the chromosomal region harboring the Apo E gene, was selected
for finding biallelic markers in linkage disequilibrium with the Apo E gene as follows.

This BAC contained an insert of 205 kb that was subcloned as previously described. Fifty BAC subclones were
randomly selected and sequenced. Twenty five subclone sequences were selected and used to design twenty five pairs
of PCR primers allowing 500 bp-amplicons to be generated. These PCR primers were then used to amplify the
corresponding genomic sequences in a pool of DNA from 100 unrelated individuals (blood donors of French origin) as
already described.

Amplification products from pooled DNA were sequenced and analyzed for the presence of biallelic
polymorphisms, as already described. Five amplicons were shown to contain a polymorphic base in the pool of 100
unrelated individuals, and therefore these polymorphisms were selected as random biallelic markers in the vicinity of the
Apo E gene. The sequences of both alleles of these biallelic markers (99-344/439; 99-355/219; 99-359/308;
99-365/344; 99-366/274) correspond to SEQ ID Nos: 301-305 and 307-311 (see the accompanying Sequence Listing and
Table 10). Corresponding pairs of amplification primers for generating amplicons containing these biallelic markers can
be chosen from those listed as SEQ ID Nos: 313-317 and 319-323.

An additional pair of primers (SEQ ID Nos: 318 and 324) was designed that allows amplification of the
genomic fragment carrying the biallelic polymorphism corresponding to the ApoE marker (99-2452/54; C/T; the C allele
is designated SEQ ID NO: 306 in the accompanying Sequence Listing, while the T allele is designated SEQ ID NO: 312 in
the accompanying Sequence Listing; see also Table 10), publicly known as Apo E site A (Weisgraber et al. (1981),
supra; Rall et al. (1982), supra).

The five random biallelic markers plus the Apo E site A marker were physically ordered by PCR screening of the
corresponding amplicons using all available BACs originally selected from the genomic DNA libraries, as previously
described, using the public Apo E marker stSG94. The amplicons' order derived from this BAC screening is as follows:
(99-344/99-366) - (99-365/99-2452) - 99-359 - 99-355,
where brackets indicate that the exact order of the respective amplicons couldn't be established.

Linkage disequilibrium among the six biallelic markers (five random markers plus the Apo E site A) was
determined by genotyping the same 100 unrelated individuals from whom the random biallelic markers were identified.

DNA samples and amplification products from genomic PCR were obtained in similar conditions as those
described above for the generation of biallelic markers, and subjected to automated microsequencing reactions using
fluorescent ddNTPs (specific fluorescence for each ddNTP) and the appropriate microsequencing primers having a 3' end
immediately upstream of the polymorphic base in the biallelic markers. The sequence of these microsequencing primers is
indicated within the corresponding sequence listings of SEQ ID Nos: 325-330. Once specifically extended at the 3' end
by a DNA polymerase using the complementary fluorescent dideoxynucleotide analog (thermal cycling), the
microsequencing primer was precipitated to remove the unincorporated fluorescent ddNTPs. The reaction products were
analyzed by electrophoresis on ABI 377 sequencing machines. Results were automatically analyzed by an appropriate
software further described in Example 13.

Linkage disequilibrium (LD) between all pairs of biallelic markers (Mi, Mj) was calculated for every allele
combination (Mi1,Mj1; Mi1,Mj2; Mi2,Mj1; Mi2,Mj2) according to the maximum likelihood estimate (MLE) for delta (the
composite linkage disequilibrium coefficient). The results of the LD analysis between the Apo E Site A marker and the
five new biallelic markers (99-344/439; 99-355/219; 99-359/308; 99-365/344; 99-366/274) are summarized in Table
2 below:

Table 2

Markers        d x 100    SEQ ID Nos of the     SEQ ID Nos of the
                          biallelic Markers     amplification Primers

ApoE Site A               306                   318
99-2452/54                312                   324

99-344/439     1          301                   313
                          307                   319

99-366/274     1          305                   317
                          311                   323

99-365/344     8          304                   316
                          310                   322

99-359/308     2          303                   315
                          309                   321

99-355/219     1          302                   314
                          308                   320



The above LD results indicate that among the five biallelic markers randomly selected in a region of about 200
kb containing the Apo E gene, marker 99-365/344T is in relatively strong linkage disequilibrium with the Apo E site A
allele (99-2452/54C).

Therefore, since the Apo E site A allele is associated with Alzheimer's disease, one can predict that the T allele
of marker 99-365/344 will probably be found associated with AD. In order to test this hypothesis, the biallelic markers
of SEQ ID Nos: 301-306 and 307-312 were used in association studies as described below.

225 Alzheimer's disease patients were recruited according to clinical inclusion criteria based on the MMSE
test. The 246 control cases included in this study were both ethnically- and age-matched to the affected cases. Both
affected and control individuals corresponded to unrelated cases. The identities of the polymorphic bases of each of the
biallelic markers were determined in each of these individuals using the methods described above. Techniques for
conducting association studies are further described below.

The resuhs of this study are summarized in Table 3 below : 



Table 3

MARKER                      ASSOCIATION DATA

                            Difference in allele frequency     Corresponding p-value
                            between individuals with
                            Alzheimer's and control
                            individuals

99-344/439                  3.3%                               9.54 E-02
99-366/274                  1.6%                               2.09 E-01
99-365/344                  17.7%                              6.9 E-10
99-2452/54 (ApoE Site A)    23.8%                              3.95 E-21
99-359/308                  0.4%                               9.2 E-01
99-355/219                  1.5%                               2.54 E-01



The frequency of the Apo E site A allele in both AD cases and controls was found in agreement with that
previously reported (ca. 10% in controls and ca. 34% in AD cases, leading to a 24% difference in allele frequency), thus
validating the Apo E e4 association in the populations used for this study.

Moreover, as predicted from the LD analysis (Table 2), a significant association of the T allele of marker 99-
365/344 with AD cases (18% increase in the T allele frequency in AD cases compared to controls, p value for this
difference = 6.9 E-10) was observed.

The above results indicate that any marker in LD with one given marker associated with a trait will be
associated with the trait. It will be appreciated that, though in this case the ApoE Site A marker is the trait-causing
allele (TCA) itself, the same conclusion could be drawn with any other non-TCA marker associated with the studied trait.

These results further indicate that conducting association studies with a set of biallelic markers randomly
generated within a candidate region at a sufficient density (here about one biallelic marker every 40kb on average)
allows the identification of at least one marker associated with the trait.

In addition, these results correlate with the physical order of the six biallelic markers contemplated within the
present example (see above): marker 99-365/344, which had been found to be the closest in terms of physical distance
to the ApoE Site A marker, also shows the strongest LD with the Apo E site A marker.

In order to further refine the relationship between physical distance and linkage disequilibrium between biallelic
markers, a ca. 450 kb fragment from a genomic region on chromosome 8 was fully sequenced.

LD within ca. 230 pairs of biallelic markers derived therefrom was measured in a random French population
and analyzed as a function of the known physical inter-marker spacing. This analysis confirmed that, on average, LD
between 2 biallelic markers correlates with the physical distance that separates them. It further indicated that LD
between 2 biallelic markers tends to decrease when their spacing increases. More particularly, LD between 2 biallelic
markers tends to decrease when their inter-marker distance is greater than 50kb, and is further decreased when the
inter-marker distance is greater than 75kb. It was further observed that when 2 biallelic markers were further than
150kb apart, most often no significant LD between them could be evidenced. It will be appreciated that the size and
history of the sample population used to measure LD between markers may influence the distance beyond which LD
tends not to be detectable.
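
As an illustration of the pairwise LD measurements discussed above, the composite linkage disequilibrium
coefficient (delta) for one allele combination may be sketched as follows. This is a minimal sketch which assumes the
usual composite estimator computed from unphased genotype counts; the function name composite_ld and the encoding of
each individual as a pair of allele dosages (0, 1 or 2 copies of the chosen allele at each marker) are hypothetical
conveniences and are not taken from the software referred to herein.

```python
def composite_ld(dosages):
    """Composite LD coefficient (delta) for one allele pair (assumed estimator).

    dosages: list of (a, b) tuples, one per unrelated individual, where
    a is the number of copies (0, 1, 2) of the chosen allele at marker Mi
    and b is the number of copies of the chosen allele at marker Mj.
    """
    n = len(dosages)
    if n == 0:
        raise ValueError("at least one genotyped individual is required")

    # Genotype classes that contribute to the composite count.
    n1 = sum(1 for a, b in dosages if a == 2 and b == 2)  # double homozygotes
    n2 = sum(1 for a, b in dosages if a == 2 and b == 1)
    n3 = sum(1 for a, b in dosages if a == 1 and b == 2)
    n4 = sum(1 for a, b in dosages if a == 1 and b == 1)  # double heterozygotes

    # Allele frequencies at each marker (2n chromosomes in total).
    p_i = sum(a for a, _ in dosages) / (2 * n)
    p_j = sum(b for _, b in dosages) / (2 * n)

    return (2 * n1 + n2 + n3 + n4 / 2) / n - 2 * p_i * p_j


# Hypothetical genotypes: the chosen alleles always travel together.
example = [(2, 2), (0, 0), (2, 2), (0, 0)]
delta = composite_ld(example)
```

Under these assumptions a value of zero indicates no detectable composite disequilibrium for the allele pair, and
the value multiplied by 100 would correspond to a "d x 100" style of reporting.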

Assuming that LD can be measured between markers spanning regions up to an average of 150kb long, biallelic
marker maps will allow genome-wide LD mapping, provided they have an average inter-marker distance lower than
150kb.

Genome-wide LD mapping aims at identifying, for any TCA being searched, at least one biallelic marker in LD
with said TCA. Preferably, in order to enhance the power of LD maps, in some embodiments, the biallelic markers therein
have average inter-marker distances of 150kb or less, 75kb or less, 50kb or less, 30kb or less, or 25kb or less to
accommodate the fact that in some regions of the genome, the detection of LD requires lower inter-marker distances.

The present invention provides methods to generate biallelic marker maps with average inter-marker distances
of 150kb or less. In some embodiments, the mean distance between biallelic markers constituting the high density map
will be less than 75kb, preferably less than 50kb. Further preferred maps according to the present invention contain
markers that are less than 37.5kb apart. In highly preferred embodiments, the average inter-marker spacing for the
biallelic markers constituting very high density maps is less than 30kb, most preferably less than 25kb.
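
The relationship between average inter-marker distance and the number of markers in a genome-wide map may be
illustrated by the short calculation below. It is a sketch only: the genome size of about 3,000 Mb is an assumption
(it is the size implied by the statement elsewhere in this description that 5,000 markers correspond to a resolution
of ca. 600 kb), and the function names are hypothetical.

```python
import math

# Assumption: a genome of about 3,000 Mb, expressed in kb.
GENOME_SIZE_KB = 3_000_000


def markers_needed(avg_spacing_kb, genome_size_kb=GENOME_SIZE_KB):
    """Number of evenly spaced markers giving the requested average spacing."""
    if avg_spacing_kb <= 0:
        raise ValueError("spacing must be positive")
    return math.ceil(genome_size_kb / avg_spacing_kb)


def average_spacing_kb(n_markers, genome_size_kb=GENOME_SIZE_KB):
    """Average inter-marker distance (kb) for a map of n_markers markers."""
    if n_markers <= 0:
        raise ValueError("marker count must be positive")
    return genome_size_kb / n_markers


# Spacings named in the text, in kb, and the map sizes they imply.
map_sizes = {kb: markers_needed(kb) for kb in (150, 75, 50, 37.5, 30, 25)}
# map_sizes == {150: 20000, 75: 40000, 50: 60000, 37.5: 80000, 30: 100000, 25: 120000}
```

Under this assumption, an average spacing of 150kb corresponds to a map of about 20,000 markers and a spacing of
25kb to a map of about 120,000 markers.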

Genetic maps containing biallelic markers (including the 653 biallelic markers obtained above, which include the
sequences of SEQ ID Nos. 1-50 and 51-100 or the sequences complementary thereto) may be used to identify and
isolate genes associated with detectable traits. The use of the genetic maps of the present invention is described in
more detail below.

Use of the High Density Biallelic Marker Map to Identify
Genes Associated with a Detectable Trait
One embodiment of the present invention comprises methods for identifying and isolating genes associated
with a detectable trait using the biallelic marker maps of the present invention.

In the past, the identification of genes linked with detectable traits has relied on a statistical approach called
linkage analysis. Linkage analysis is based upon establishing a correlation between the transmission of genetic markers
and that of a specific trait throughout generations within a family. In this approach, all members of a series of affected
families are genotyped with a few hundred markers, typically microsatellite markers, which are distributed at an average
density of one every 10 Mb. By comparing genotypes in all family members, one can attribute sets of alleles to parental
haploid genomes (haplotyping or phase determination). The origin of recombined fragments is then determined in the
offspring of all families. Those that co-segregate with the trait are tracked. After pooling data from all families,
statistical methods are used to determine the likelihood that the marker and the trait are segregating independently in all
families. As a result of the statistical analysis, one or several regions having a high probability of harboring a gene linked
to the trait are selected as candidates for further analysis. The result of linkage analysis is considered as significant (i.e.
there is a high probability that the region contains a gene involved in a detectable trait) when the chance of independent
segregation of the marker and the trait is lower than 1 in 1000 (expressed as a LOD score > 3). Generally, the length
of the candidate region identified using linkage analysis is between 2 and 20Mb.
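
The correspondence between a LOD score above 3 and odds of more than 1000 to 1 against independent segregation
may be illustrated with the following sketch. It assumes the simplest textbook setting, namely a two-point LOD
score computed from fully informative, phase-known meioses; the function name lod_score and the example counts are
hypothetical and are not drawn from any study described herein.

```python
import math


def lod_score(recombinants, meioses, theta):
    """Two-point LOD score for phase-known, fully informative meioses.

    Compares the likelihood of the data at recombination fraction theta
    with the likelihood under independent segregation (theta = 0.5).
    """
    if not 0 < theta <= 0.5:
        raise ValueError("theta must lie in (0, 0.5]")
    if not 0 <= recombinants <= meioses:
        raise ValueError("recombinants must lie between 0 and meioses")

    non_recombinants = meioses - recombinants
    log_l_theta = (recombinants * math.log10(theta)
                   + non_recombinants * math.log10(1 - theta))
    log_l_null = meioses * math.log10(0.5)
    return log_l_theta - log_l_null


# Hypothetical pooled family data: 1 recombinant out of 20 meioses.
score = lod_score(recombinants=1, meioses=20, theta=0.05)
odds_for_linkage = 10 ** score  # exceeds 1000 whenever score > 3
```

A score of 3 corresponds to a likelihood ratio of 1000 in favor of linkage, which is the significance convention
stated above.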

Once a candidate region is identified as described above, analysis of recombinant individuals using additional 
markers allows further delineation of the candidate linked region. 

Linkage analysis studies have generally relied on the use of a maximum of 5,000 microsatellite markers, thus
limiting the maximum theoretical attainable resolution of linkage analysis to ca. 600 kb on average.

Linkage analysis has been successfully applied to map simple genetic traits that show clear Mendelian
inheritance patterns and which have a high penetrance (penetrance is the ratio between the number of trait positive
carriers of allele a and the total number of a carriers in the population). About 100 pathological trait-causing genes were
discovered using linkage analysis over the last 10 years. In most of these cases, the majority of affected individuals had
affected relatives and the detectable trait was rare in the general population (frequencies less than 0.1%). In about 10
cases, such as Alzheimer's Disease, breast cancer, and Type II diabetes, the detectable trait was more common but the
allele associated with the detectable trait was rare in the affected population. Thus, the alleles associated with these
traits were not responsible for the trait in all sporadic cases.

Linkage analysis suffers from a variety of drawbacks. First, linkage analysis is limited by its reliance on the
choice of a genetic model suitable for each studied trait. Furthermore, as already mentioned, the resolution attainable
using linkage analysis is limited, and complementary studies are required to refine the analysis of the typical 2Mb to
20Mb regions initially identified through linkage analysis.

In addition, linkage analysis approaches have proven difficult when applied to complex genetic traits, such as
those due to the combined action of multiple genes and/or environmental factors. In such cases, too large an effort and
cost are needed to recruit the adequate number of affected families required for applying linkage analysis to these
situations, as recently discussed by Risch, N. and Merikangas, K. (Science 273:1516-1517 (1996), the disclosure of
which is incorporated herein by reference).

Finally, linkage analysis cannot be applied to the study of traits for which no large informative families are
available. Typically, this will be the case in any attempt to identify trait-causing alleles involved in sporadic cases, such
as alleles associated with positive or negative responses to drug treatment.

The present genetic maps and biallelic markers (including the 653 biallelic markers obtained above, which
include the sequences of SEQ ID Nos. 1-50 and 51-100 or the sequences complementary thereto) may be used to
identify and isolate genes associated with detectable traits using association studies, an approach which does not
require the use of affected families and which permits the identification of genes associated with sporadic traits.

Association studies are described in more detail below.

Association Studies 

As already mentioned, any gene responsible or partly responsible for a given trait will be in LD with some
flanking markers. To map such a gene, specific alleles of these flanking markers which are associated with the gene or
genes responsible for the trait are identified. Although the following discussion of techniques for finding the gene or
genes associated with a particular trait using linkage disequilibrium mapping refers to locating a single gene which is
responsible for the trait, it will be appreciated that the same techniques may also be used to identify genes which are
partially responsible for the trait.

Association studies may be conducted within the general population (as opposed to the linkage analysis
techniques discussed above which are limited to studies performed on related individuals in one or several affected
families).

Association between a biallelic marker A and a trait T may primarily occur as a result of three possible
relationships between the biallelic marker and the trait.

First, allele a of biallelic marker A may be directly responsible for trait T (e.g., Apo E e4 site A and Alzheimer's
disease). However, since the majority of the biallelic markers used in genetic mapping studies are selected randomly,
they mainly map outside of genes. Thus, the likelihood of allele a being a functional mutation directly related to trait T is
very low.

Second, an association between a biallelic marker A and a trait T may also occur when the biallelic marker is
very closely linked to the trait locus. In other words, an association occurs when allele a is in linkage disequilibrium with
the trait-causing allele. When the biallelic marker is in close proximity to a gene responsible for the trait, more extensive
genetic mapping will ultimately allow a gene to be discovered near the marker locus which carries mutations in people
with trait T (i.e. the gene responsible for the trait or one of the genes responsible for the trait). As will be further
exemplified below, using a group of biallelic markers which are in close proximity to the gene responsible for the trait, the
location of the causal gene can be deduced from the profile of the association curve between the biallelic markers and
the trait. The causal gene will usually be found in the vicinity of the marker showing the highest association with the
trait.

Finally, an association between a biallelic marker and a trait may occur when people with the trait and people
without the trait correspond to genetically different subsets of the population who, coincidentally, also differ in the
frequency of allele a (population stratification). This phenomenon may be avoided by using ethnically matched large
heterogeneous samples.

Association studies are particularly suited to the efficient identification of genes that present common
polymorphisms, and are involved in multifactorial traits whose frequency is relatively higher than that of diseases with

Association studies mainly consist of four steps: recruitment of trait-positive (T+) and trait-negative (T-)
populations with well-defined phenotypes, identification of a candidate region suspected of harboring a trait causing
gene, identification of said gene among candidate genes in the region, and finally validation of mutation(s) responsible for
the trait in said trait causing gene.

In a first step, trait + and trait - phenotypes have to be well-defined. In order to perform efficient and
significant association studies such as those described herein, the trait under study should preferably follow a bimodal
distribution in the population under study, presenting two clear non-overlapping phenotypes, trait + and trait -.

Nevertheless, in the absence of such a bimodal distribution (as may in fact be the case for complex genetic
traits), any genetic trait may still be analyzed using the association method proposed herein by carefully selecting the
individuals to be included in the trait + and trait - phenotypic groups. The selection procedure involves selecting
individuals at opposite ends of the non-bimodal phenotype spectrum of the trait under study, so as to include in these
trait + and trait - populations individuals who clearly represent non-overlapping, preferably extreme phenotypes.

The definition of the inclusion criteria for the trait + and trait - populations is an important aspect of the
present invention. The selection of those drastically different but relatively uniform phenotypes enables efficient
comparisons in association studies and the possible detection of marked differences at the genetic level, provided that
the sample sizes of the populations under study are significant enough.

Generally, trait + and trait - populations to be included in association studies such as those proposed in the
present invention consist of phenotypically homogeneous populations of individuals each representing 100% of the
corresponding phenotype if the trait distribution is bimodal. If the trait distribution is non-bimodal, trait + and trait -
populations consist of phenotypically uniform populations of individuals representing each between 1 and 98%,
preferably between 1 and 80%, more preferably between 1 and 50%, and more preferably between 1 and 30%, most
preferably between 1 and 20% of the total population under study, and selected among individuals exhibiting non-
overlapping phenotypes. In some embodiments, the T+ and T- groups consist of individuals exhibiting the extreme
phenotypes within the studied population. The clearer the difference between the two trait phenotypes, the greater the
probability of detecting an association with biallelic markers.
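
The selection of individuals at opposite ends of a non-bimodal phenotype spectrum may be sketched as follows. The
sketch assumes that the trait is summarized by a single quantitative value per individual; the function name
select_extremes, the identifiers, and the default fraction of 20% are hypothetical choices made for illustration.

```python
def select_extremes(phenotypes, fraction=0.2):
    """Pick non-overlapping trait - and trait + groups from the two tails.

    phenotypes: dict mapping an individual identifier to a quantitative
    trait value. fraction: share of the studied population placed in each
    group (at most 0.5 so that the two groups cannot overlap).
    Returns (trait_negative_ids, trait_positive_ids).
    """
    if not 0 < fraction <= 0.5:
        raise ValueError("fraction must lie in (0, 0.5]")
    if len(phenotypes) < 2:
        raise ValueError("at least two individuals are required")

    ranked = sorted(phenotypes, key=phenotypes.get)  # lowest value first
    group_size = max(1, int(len(ranked) * fraction))
    return ranked[:group_size], ranked[-group_size:]


# Hypothetical population of ten individuals with trait values 0..9.
population = {f"ind{i}": float(i) for i in range(10)}
trait_negative, trait_positive = select_extremes(population, fraction=0.2)
```

Taking a smaller fraction yields more sharply contrasted phenotypes at the cost of smaller groups, which is the
trade-off described above.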

In preferred embodiments, a first group of between 50 and 300 trait + individuals, preferably about 100
individuals, are recruited according to their phenotypes. In each case, a similar number of trait negative individuals are
included in such studies who are preferably both ethnically- and age-matched to the trait positive cases. Both trait + and
trait - individuals should correspond to unrelated cases.

Figure 3 shows, for a series of hypothetical sample sizes, the p-value significance obtained in association
studies performed using individual markers from the high-density biallelic map, according to various hypotheses regarding
the difference of allelic frequencies between the T+ and T- samples. It indicates that in all cases, samples ranging from
150 to 500 individuals are numerous enough to achieve statistical significance. It will be appreciated that bigger or
smaller groups can be used to perform association studies according to the methods of the present invention.

In a second step, a marker/trait association study is performed that compares the genotype frequency of each
biallelic marker in the above described T+ and T- populations by means of a chi square statistical test (one degree of
freedom). In addition to this single marker association analysis, a haplotype association analysis is performed to define
the frequency and the type of the ancestral carrier haplotype. Haplotype analysis, by combining the informativeness of a
set of biallelic markers, increases the power of the association analysis, allowing false positive and/or negative data that
may result from the single marker studies to be eliminated.
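
A single marker comparison of this kind may be sketched as a chi square test with one degree of freedom on a 2 x 2
table of allele counts. The sketch below is illustrative only: it assumes that the comparison is made on allele
(chromosome) counts rather than on the software actually used, the function name allele_chi_square is hypothetical,
and the example counts are merely derived from the approximate frequencies reported above for the Apo E site A
allele (ca. 34% of 450 case chromosomes and ca. 10% of 492 control chromosomes).

```python
import math


def allele_chi_square(case_alt, case_total, ctrl_alt, ctrl_total):
    """Pearson chi square (1 degree of freedom) on a 2 x 2 allele count table.

    case_alt / ctrl_alt: copies of the tested allele in T+ and T- samples.
    case_total / ctrl_total: chromosomes typed (twice the individuals).
    Returns (chi_square, p_value).
    """
    observed = [
        [case_alt, case_total - case_alt],
        [ctrl_alt, ctrl_total - ctrl_alt],
    ]
    row_totals = [case_total, ctrl_total]
    col_totals = [observed[0][0] + observed[1][0],
                  observed[0][1] + observed[1][1]]
    grand_total = case_total + ctrl_total

    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_totals[i] * col_totals[j] / grand_total
            if expected == 0:
                raise ValueError("an expected count is zero; test undefined")
            chi2 += (observed[i][j] - expected) ** 2 / expected

    # Upper tail of the chi square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


# Illustrative counts only (rounded from the approximate frequencies).
chi2_apoe, p_apoe = allele_chi_square(153, 450, 49, 492)
```

The closed form used for the p-value holds only for one degree of freedom, which is the case stated above for a
biallelic marker compared between two populations.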

Genotyping can be performed using the microsequencing procedure described in Example 13, or any other
genotyping procedure suitable for this intended purpose.

If a positive association with a trait is identified using an array of biallelic markers having a high enough
density, the causal gene will be physically located in the vicinity of the associated markers, since the markers showing
positive association with the trait are in linkage disequilibrium with the trait locus. Regions harboring a gene responsible
for a particular trait which are identified through association studies using high density sets of biallelic markers will, on
average, be 20 - 40 times shorter in length than those identified by linkage analysis.

Once a positive association is confirmed as described above, a third step consists of completely sequencing the
BAC inserts harboring the markers identified in the association analyses. These BACs are obtained through screening
human genomic libraries with the marker probes and/or primers, as described above. Once a candidate region has been
sequenced and analyzed, the functional sequences within the candidate region (e.g. exons, splice sites, promoters, and
other potential regulatory regions) are scanned for mutations which are responsible for the trait by comparing the
sequences of the functional regions in a selected number of T+ and T- individuals using appropriate software. Tools for
sequence analysis are further described in Example 14.

Finally, candidate mutations are then validated by screening a larger population of T+ and T- individuals using
genotyping techniques described below. Polymorphisms are confirmed as candidate mutations when the validation
population shows association results compatible with those found between the mutation and the trait in the test
population.

In practice, in order to define a region bearing a candidate gene, the trait + and trait - populations are
genotyped using an appropriate number of biallelic markers. The markers may include one or more of the 653 markers
obtained above (which include the sequences of SEQ ID Nos: 1-50 and 51-100 or the sequences complementary thereto).

The markers used to define a region bearing a candidate gene may be distributed at an average density of 1
marker per 10-200 kb. Preferably, the markers used to define a region bearing a candidate gene are distributed at an
average density of 1 marker every 15-150 kb. In further preferred embodiments, the markers used to define a region
bearing a candidate gene are distributed at an average density of 1 marker every 20-100kb. In yet another preferred
embodiment, the markers used to define a region bearing a candidate gene are distributed at an average density of 1
marker every 100 to 150kb. In a further highly preferred embodiment, the markers used to define a region bearing a
candidate gene are distributed at an average density of 1 marker every 50 to 100kb. In yet another embodiment, the
biallelic markers used to define a region bearing a candidate gene are distributed at an average density of 1 marker every
25-50 kilobases. As mentioned above, in order to enhance the power of linkage disequilibrium based maps, in a preferred
embodiment the marker density of the map will be adapted to take the linkage disequilibrium distribution in the genomic
region of interest into account.

In some embodiments, the initial identification of a candidate genomic region harboring a gene associated with
a detectable phenotype may be conducted using a preliminary map containing a few thousand biallelic markers.
Thereafter, the genomic region harboring the gene responsible for the detectable trait may be better delineated using a
map containing a larger number of biallelic markers. Furthermore, the genomic region harboring the gene responsible for
the detectable trait may be further delineated using a high density map of biallelic markers. Finally, the gene associated
with the detectable trait may be identified and isolated using a very high density biallelic marker map.

Example 11 describes a hypothetical procedure for identifying a candidate region harboring a gene associated
with a detectable trait. It will be appreciated that although Example 11 compares the results of analyses using markers
derived from maps having 3,000, 20,000, and 60,000 markers, the number of markers contained in the map is not
restricted to these exemplary figures. Rather, Example 11 exemplifies the increasing refinement of the candidate region
with increasing marker density. As increasing numbers of markers are used in the analysis, points in the association
analysis become broad peaks. The gene associated with the detectable trait under investigation will lie within or near
the region under the peak.

Example 11

Identification of a Candidate Region Harboring a
Gene Associated with a Detectable Trait
The initial identification of a candidate genomic region harboring a gene associated with a detectable trait may
be conducted using a genome-wide map comprising about 20,000 biallelic markers. The candidate genomic region may
be further defined using a map having a higher marker density, such as a map comprising about 40,000 markers, about
60,000 markers, about 80,000 markers, about 100,000 markers, or about 120,000 markers.

The use of high density maps such as those described above allows the identification of genes which are truly
associated with detectable traits, since the coincidental associations will be randomly distributed along the genome
while the true associations will map within one or more discrete genomic regions. Accordingly, biallelic markers located
in the vicinity of a gene associated with a detectable trait will give rise to broad peaks in graphs plotting the frequencies
of the biallelic markers in T+ individuals versus T- individuals. In contrast, biallelic markers which are not in the vicinity
of the gene associated with the detectable trait will produce unique points in such a plot. By determining the
association of several markers within the region containing the gene associated with the detectable trait, the gene
associated with the detectable trait can be identified using an association curve which reflects the difference between
the allele frequencies within the T+ and T- populations for each studied marker. The gene associated with the
detectable trait will be found in the vicinity of the marker showing the highest association with the trait.

Figures 4, 5, and 6 illustrate the above principles. As illustrated in Figure 4, an association analysis conducted
with a map comprising about 3,000 biallelic markers yields a group of points. However, when an association analysis is
performed using a denser map which includes additional biallelic markers, the points become broad peaks indicative of
the location of a gene associated with a detectable trait. For example, the biallelic markers used in the initial association
analysis may be obtained from a map comprising about 20,000 biallelic markers, as illustrated in Figure 5. In some
embodiments, one or more of the 653 biallelic markers obtained above (which include the sequences of SEQ ID Nos. 1-50
and 51-100 or the sequences complementary thereto) are used in the association analysis.

In the hypothetical example of Figure 4, the association analysis with 3,000 markers suggests peaks near 
markers 9 and 17. 

Next, a second analysis is performed using additional markers in the vicinity of markers 9 and 17, as illustrated
in the hypothetical example of Figure 5, using a map of about 20,000 markers. This step again indicates an association
in the close vicinity of marker 17, since more markers in this region show an association with the trait. However, none
of the additional markers around marker 9 shows a significant association with the trait, which makes marker 9 a
potential false positive. In some embodiments, one or more of the 653 biallelic markers obtained above (which include
the sequences of SEQ ID Nos. 1-50 and 51-100 or the sequences complementary thereto) are used in the second
analysis. In order to further test the validity of these two suspected associations, a third analysis may be obtained with
a map comprising about 60,000 biallelic markers. In some embodiments, one or more of the 653 biallelic markers
obtained above are used in the third association analysis. In the hypothetical example of Figure 6, more markers lying
around marker 17 exhibit a high degree of association with the detectable trait. Conversely, no association is confirmed
in the vicinity of marker 9. The genomic region surrounding marker 17 can thus be considered a candidate region for the
hypothetical trait of this simulation.
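
The distinction drawn above between a broad peak (as around marker 17) and an isolated point (as at marker 9) may
be sketched as a simple rule on markers ordered by map position. The rule used here, namely that a significant
marker belongs to a peak when at least one adjacent marker is also significant, is an assumption made for
illustration, as are the function name classify_signals, the threshold and the p-values.

```python
def classify_signals(p_values, threshold):
    """Split significant markers into peak members and isolated points.

    p_values: single marker p-values ordered by position along the map.
    threshold: p-value below which an association is deemed significant.
    Returns (peak_indices, isolated_indices).
    """
    significant = [p < threshold for p in p_values]
    peaks, isolated = [], []
    for i, is_significant in enumerate(significant):
        if not is_significant:
            continue
        left = i > 0 and significant[i - 1]
        right = i + 1 < len(significant) and significant[i + 1]
        # A neighbor that is also significant supports a true association.
        (peaks if left or right else isolated).append(i)
    return peaks, isolated


# Hypothetical p-values for eight consecutive markers.
example_p = [0.5, 0.001, 0.6, 0.4, 0.002, 0.0005, 0.003, 0.7]
peak_markers, isolated_markers = classify_signals(example_p, threshold=0.01)
# peak_markers == [4, 5, 6]; isolated_markers == [1]
```

In this sketch an isolated marker is only flagged as a potential false positive; as described above, it is the
analysis with a denser map that confirms or dismisses it.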

The statistical power of LD mapping using a high density marker map is also reinforced by complementing the
single point association analysis described above with a multi-marker association analysis, called haplotype analysis.

When a chromosome carrying a disease allele is first introduced into a population as a result of either mutation
or migration, the mutant allele necessarily resides on a chromosome having a unique set of linked markers: the ancestral
haplotype. As already mentioned, a haplotype association analysis allows the frequency and the type of the ancestral
carrier haplotype to be defined.

A haplotype analysis is performed by estimating the frequencies of all possible haplotypes for a given set of
biallelic markers in the T+ and T- populations, and comparing these frequencies by means of a chi square statistical test
(one degree of freedom). Haplotype estimations are usually performed by applying the Expectation-Maximization (EM)
algorithm (Excoffier L and Slatkin M, Mol. Biol. Evol. 12:921-927 (1995), the disclosure of which is incorporated herein
by reference), using the EM-HAPLO program (Hawley ME, Pakstis AJ & Kidd KK, Am. J. Phys. Anthropol. 18:104
(1994), the disclosure of which is incorporated herein by reference). The EM algorithm is used to estimate haplotype
frequencies in the case when only genotype data from unrelated individuals are available. The EM algorithm is a
generalized iterative maximum likelihood approach to estimation that is useful when data are ambiguous and/or
incomplete.
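
The principle of the EM estimation may be illustrated, for the simplest case of two biallelic markers, by the
following sketch. It is not the program cited above: the function name em_haplotype_freqs, the dosage encoding (0, 1
or 2 copies of a chosen allele at each marker) and the convergence settings are hypothetical. With two markers, only
double heterozygotes are ambiguous as to phase, and the expectation step apportions them between the two possible
phases according to the current frequency estimates.

```python
def em_haplotype_freqs(dosages, max_iterations=100, tolerance=1e-10):
    """EM estimate of two-marker haplotype frequencies from unphased data.

    dosages: list of (a, b) per unrelated individual, the number of copies
    (0, 1, 2) of a chosen allele at marker 1 and marker 2.
    Returns a dict keyed by (x, y), where x and y are 1 if the haplotype
    carries the chosen allele at marker 1 and marker 2 respectively.
    """
    n = len(dosages)
    if n == 0:
        raise ValueError("at least one genotyped individual is required")

    haplotypes = [(x, y) for x in (0, 1) for y in (0, 1)]
    base = dict.fromkeys(haplotypes, 0.0)  # counts from unambiguous genotypes
    n_double_het = 0
    for a, b in dosages:
        if a == 1 and b == 1:
            n_double_het += 1  # phase unknown: handled in the E-step
            continue
        xs = (a // 2, a // 2) if a != 1 else (0, 1)
        ys = (b // 2, b // 2) if b != 1 else (0, 1)
        for x, y in zip(xs, ys):
            base[(x, y)] += 1

    freqs = dict.fromkeys(haplotypes, 0.25)  # uninformative starting point
    for _ in range(max_iterations):
        # E-step: expected share of double heterozygotes in each phase.
        cis = freqs[(1, 1)] * freqs[(0, 0)]
        trans = freqs[(1, 0)] * freqs[(0, 1)]
        w = cis / (cis + trans) if cis + trans > 0 else 0.5

        # M-step: re-estimate frequencies from completed haplotype counts.
        counts = dict(base)
        counts[(1, 1)] += n_double_het * w
        counts[(0, 0)] += n_double_het * w
        counts[(1, 0)] += n_double_het * (1 - w)
        counts[(0, 1)] += n_double_het * (1 - w)
        updated = {h: c / (2 * n) for h, c in counts.items()}

        change = max(abs(updated[h] - freqs[h]) for h in haplotypes)
        freqs = updated
        if change < tolerance:
            break
    return freqs


# Hypothetical sample: the chosen alleles are always found together.
sample = [(2, 2)] * 3 + [(0, 0)] * 3 + [(1, 1)] * 4
estimated = em_haplotype_freqs(sample)
```

The frequencies estimated separately in the T+ and T- populations could then be compared haplotype by haplotype
with a chi square test as described above. Extending the sketch to three or four markers requires enumerating every
phase compatible with each multi-marker genotype.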

To improve the statistical power of the individual marker association analyses conducted as described above
using maps of increasing marker densities, haplotype studies can be performed using groups of markers located in
proximity to one another within regions of the genome. For example, using the methods described above in which the
association of an individual marker with a detectable phenotype was analyzed using maps of 3,000 markers, 20,000
markers, and 60,000 markers, a series of haplotype studies can be performed using groups of contiguous markers from
such maps or from maps having higher marker densities.

WO 99/04038 PCT/IB98/01193

In a preferred embodiment, a series of successive haplotype studies including groups of markers spanning
regions of more than 1 Mb may be performed. In some embodiments, the biallelic markers included in each of these
groups may be located within a genomic region spanning less than 1kb, from 1 to 5kb, from 5 to 10kb, from 10 to 25kb,
from 25 to 50kb, from 50 to 150kb, from 150 to 250kb, from 250 to 500kb, from 500kb to 1Mb, or more than 1Mb.
Preferably, the genomic regions containing the groups of biallelic markers used in the successive haplotype analyses are
overlapping. It will be appreciated that the groups of biallelic markers need not completely cover the genomic regions of the
above specified lengths but may instead be obtained from incomplete contigs having one or more gaps therein. As discussed
in further detail below, biallelic markers may be used in single point and haplotype association analyses regardless of the
completeness of the corresponding physical contig harboring them.

Without wishing to be limited to any particular numerical value, it is believed that those haplotypes displaying a
coefficient of relative risk above 1, preferably about 5 or more, preferably of about 7 or more are indicative of a
'significant risk' for the individuals carrying the identified haplotype to develop the given trait. However, it is difficult to
evaluate accurately quantified boundaries for the so-called 'significant risk'. Indeed, and as it has been demonstrated
previously, several traits observed in a given population are multifactorial in that they are not only the result of a single
genetic predisposition but also of other factors such as environmental factors. Thus, the evaluation of a significant risk
must take these parameters into consideration in order to, in a certain manner, weigh the potential importance of
external parameters in the development of a given trait. Thus, the relative risk which constitutes a 'significant risk' to
develop a given trait is evaluated differently depending on the trait under consideration and the populations tested.

Genome-wide mapping using association studies with dense enough arrays of markers permits a case-by-case
best estimate of p-value significance thresholds. Given a test population comprising two ethnically matched trait
positive and trait negative groups of about 50 to about 500 individuals or more, conducting the above described
association studies will allow a p-value 'cut-off' to be established by, for example, analyzing significant numbers of
allele frequency differences or, in some cases where appropriate, running computer simulations or control studies as
described in Examples 11, 20, and 31.

For a p-value above the threshold, a corresponding association between the trait and a studied marker will be
deemed not significant, while for a p-value below such a threshold, said association will be deemed significant. If the
p-value is significant, the genomic region around the marker will be further scrutinized for a trait-causing gene.
It is preferred that p-value significance thresholds be assessed for each case/control population comparison.
Both the genetic distance between sampled populations ('stratification') and the dispersion due to random selection of
samples may indeed influence the p-value significance thresholds.

It will be appreciated that the above approaches may be conducted on any scale (i.e. over the whole genome, a
set of chromosomes, a single chromosome, a particular subchromosomal region, or any other desired portion of the
genome). As mentioned above, once significance thresholds have been assessed, population sample sizes may be
adapted as exemplified in Figure 3.

Example 12 below illustrates the increase in statistical power brought to an association study by a haplotype
analysis.

Example 12

Haplotype Analysis: Identification of biallelic markers delineating
a genomic region associated with Alzheimer's Disease (AD)

As shown in Table 3 within Example 10, at an average map density of one marker per 40 kb only one marker
(99-365/344) out of five random biallelic markers from a ca. 200 kb genomic region around the Apo E gene showed a
clear association to AD (delta allelic frequency in cases and controls = 18%; p-value = 6.9 E-10). The allelic frequencies
of the other four random markers were not significantly different between AD cases and controls (p-values ≥ E-01).
However, since linkage disequilibrium can usually be detected between markers located further apart than an average 40
kb as previously discussed, one should expect that performing an association study with a local excerpt of a biallelic
marker map covering ca. 200 kb with an average inter-marker distance of ca. 40 kb should allow the identification of
more than one biallelic marker associated with AD.

A haplotype analysis was thus performed using the biallelic markers 99-344/439; 99-355/219; 99-359/308;
99-365/344; and 99-366/274 (of SEQ ID Nos: 301-305 and 307-311).

In a first step, marker 99-365/344, which was already found associated with AD, was not included in the
haplotype study. Only biallelic markers 99-344/439; 99-355/219; 99-359/308; and 99-366/274, which did not show
any significant association with AD when taken individually, were used. This first haplotype analysis measured
frequencies of all possible two-, three-, or four-marker haplotypes in the AD case and control populations. As shown in
Figure 7, there was one haplotype among all the potential different haplotypes based on the four individually non
significant markers ('haplotype 8', TAGG, comprising SEQ ID No. 305 which is the T allele of marker 99-366/274, SEQ
ID No. 301 which is the A allele of marker 99-344/439, SEQ ID No. 303 which is the G allele of marker 99-359/306 and
SEQ ID No. 302 which is the G allele of marker 99-355/219) that was present at statistically significant different
frequencies in the AD case and control populations (Δ = 12%; p-value = 2.05 E-06). Moreover, a significant difference
was already observed for a three-marker haplotype included in the above mentioned 'haplotype 8' ('haplotype 7', TGG,
Δ = 10%; p-value = 4.76 E-05). Haplotype 7 comprises SEQ ID No. 305 which is the T allele of marker 99-366/274,
SEQ ID No. 303 which is the G allele of marker 99-359/306 and SEQ ID No. 302 which is the G allele of marker
99-355/219. The haplotype association analysis thus clearly increased the statistical power of the individual marker
association studies by more than four orders of magnitude when compared to single-marker analysis (from p-values ≥ E-01
for the individual markers, see Table 3, to p-value ≤ 2 E-06 for the four-marker 'haplotype 8').
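
To give a sense of how many haplotypes 'all possible two-, three-, or four-marker haplotypes' represents for four biallelic markers, the following minimal sketch enumerates them. The allele labels '1' and '2' are placeholders (only one allele per marker is named in the text above), and the function name is hypothetical.

```python
from itertools import combinations, product


def enumerate_haplotypes(marker_alleles, sizes=(2, 3, 4)):
    """Yield (marker_subset, allele_tuple) for every candidate haplotype.

    marker_alleles maps a marker name to its two alleles.
    """
    names = list(marker_alleles)
    for k in sizes:
        for subset in combinations(names, k):
            # One allele per marker in the subset, in subset order.
            for alleles in product(*(marker_alleles[m] for m in subset)):
                yield subset, alleles


# Placeholder allele labels; the real alleles are given in the sequence listing.
markers = {
    "99-344/439": ("1", "2"),
    "99-355/219": ("1", "2"),
    "99-359/308": ("1", "2"),
    "99-366/274": ("1", "2"),
}
candidates = list(enumerate_haplotypes(markers))
print(len(candidates))  # 6*4 + 4*8 + 1*16 = 72 candidate haplotypes
```

Each candidate haplotype would then be tested for a frequency difference between cases and controls, which is why the significance of the best p-value needs to be checked by simulation as described next.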

The significance of the values obtained for this haplotype association analysis was evaluated by the following
computer simulation. The genotype data from the AD cases and the unaffected controls were pooled and randomly
allocated to two groups which contained the same number of individuals as the case/control groups used to produce the
data summarized in Figure 7. A four-marker haplotype analysis (99-344/439; 99-355/219; 99-359/306; and
99-366/274) was run on these artificial groups. This experiment was reiterated 100 times and the results are shown in
Figure 8. No haplotype among those generated was found for which the p-value of the frequency difference between
both populations was more significant than 1 E-05. In addition, only 4% of the generated haplotypes showed p-values
lower than 1 E-04. Since both these p-value thresholds are less significant than the 2 E-06 p-value shown by
'haplotype 8', this haplotype can be considered significantly associated with AD.
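
The simulation just described (pooling cases and controls, randomly reallocating them to two groups of the original sizes, and repeating 100 times) can be sketched as follows. This is a simplified illustration under stated assumptions: each individual is represented by the number of copies (0, 1 or 2) of one haplotype of interest, the frequency difference is tested with a chi-square test on chromosome counts, and all names and data are hypothetical; the text does not specify how haplotype frequencies or p-values were computed.

```python
import math
import random


def chi2_pvalue(a, b, c, d):
    """Pearson chi-square (1 d.f.) p-value for the 2x2 table [[a, b], [c, d]]."""
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        return 1.0
    chi2 = n * (a * d - b * c) ** 2 / denom
    return math.erfc(math.sqrt(chi2 / 2.0))


def haplotype_pvalue(case_copies, ctrl_copies):
    """p-value of the haplotype frequency difference between two groups."""
    a = sum(case_copies)
    b = 2 * len(case_copies) - a
    c = sum(ctrl_copies)
    d = 2 * len(ctrl_copies) - c
    return chi2_pvalue(a, b, c, d)


def permutation_pvalues(case_copies, ctrl_copies, iterations=100, seed=0):
    """Reallocate pooled individuals at random, keeping the group sizes."""
    rng = random.Random(seed)
    pooled = list(case_copies) + list(ctrl_copies)
    n_cases = len(case_copies)
    results = []
    for _ in range(iterations):
        rng.shuffle(pooled)
        results.append(haplotype_pvalue(pooled[:n_cases], pooled[n_cases:]))
    return results


# Hypothetical data: 100 cases and 100 controls.
cases = [1] * 40 + [0] * 60
controls = [1] * 10 + [0] * 90
observed = haplotype_pvalue(cases, controls)
simulated = permutation_pvalues(cases, controls)
fraction_as_extreme = sum(p <= observed for p in simulated) / len(simulated)
```

A haplotype would be retained as significantly associated when few or none of the simulated p-values are as extreme as the observed one, which mirrors the reasoning applied to 'haplotype 8' above.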

In a second step, marker 99-365/344 was included in the haplotype analyses. The frequency differences
between the affected and non-affected populations were calculated for all two-, three-, four- or five-marker haplotypes
involving markers 99-344/439; 99-355/219; 99-359/308; 99-366/274; and 99-365/344. The most significant
p-values obtained in each category of haplotype (involving two, three, four or five markers) were examined depending on
which markers were involved or not within the haplotype. This showed that all haplotypes which included marker
99-365/344 showed a significant association with AD (p-values in the range of E-04 to E-11).

An additional way of evaluating the significance of the values obtained in the haplotype association analysis
was to perform a similar AD case-control study on biallelic markers generated from BACs containing inserts
corresponding to genomic regions derived from chromosomes 13 or 21 and not known to be involved in Alzheimer's
disease. Performing similar haplotype and individual association analyses as those described above and in Example 10
did not generate any significant association results (all p-values for haplotype analyses were less significant than E-03;
all p-values for single marker association studies were less significant than E-02).

The results described in Examples 10 and 12, generated from individual and haplotype studies using a biallelic
marker set of an average density equal to ca. 40 kb in the region of an Alzheimer's disease trait causing gene, indicate
that all biallelic markers of sufficient informative content located within a ca. 200 kb genomic region around a TCA can
potentially be successfully used to localize a trait causing gene with the methods provided by the present invention. This
conclusion is further supported by the results obtained through measuring the linkage disequilibrium between markers
99-365/344 or 99-359/306 and the ApoE 4 Site A marker within Alzheimer's patients: as one could predict since LD is the
supporting basis for association studies, LD between these pairs of markers was enhanced in the diseased population vs.
the control population, in a similar way as the haplotype analysis enhanced the significance of the corresponding
association studies.
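
As an illustration of how linkage disequilibrium between a pair of biallelic markers may be quantified, the following minimal sketch computes the coefficient D together with the normalized D' and r-squared measures from two-marker haplotype counts. The text does not state which LD measure was used, so these standard measures are an assumption, as are the function name and the counts.

```python
def linkage_disequilibrium(n_AB, n_Ab, n_aB, n_ab):
    """Return (D, D_prime, r_squared) from two-marker haplotype counts.

    A/a are the alleles of the first marker, B/b those of the second.
    D_prime keeps the sign of D in this sketch.
    """
    total = n_AB + n_Ab + n_aB + n_ab
    p_AB = n_AB / total
    p_A = (n_AB + n_Ab) / total
    p_B = (n_AB + n_aB) / total
    d = p_AB - p_A * p_B
    if d >= 0:
        d_max = min(p_A * (1 - p_B), (1 - p_A) * p_B)
    else:
        d_max = min(p_A * p_B, (1 - p_A) * (1 - p_B))
    d_prime = d / d_max if d_max > 0 else 0.0
    variance = p_A * (1 - p_A) * p_B * (1 - p_B)
    r_squared = d * d / variance if variance > 0 else 0.0
    return d, d_prime, r_squared
```

Computing these values separately in the diseased and control populations would allow the comparison described above; for example, linkage_disequilibrium(50, 0, 0, 50) describes complete LD, whereas linkage_disequilibrium(25, 25, 25, 25) describes none.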

Once a given polymorphic site has been found and characterized as a biallelic marker according to the methods
of the present invention, several methods can be used in order to determine the specific allele carried by an individual at
the given polymorphic base.

In some embodiments, genotyping will be applied to one or more of the markers of SEQ ID Nos: 301-305 and
307-311 or the sequences complementary thereto. In additional embodiments, genotyping will be applied to the markers
of SEQ ID Nos. 306 and 312 as well as one or more of the markers of SEQ ID Nos. 301-305 and 307-311. In some
embodiments, genotyping will be applied to one or more of the 653 biallelic markers obtained above (which include the
sequences of SEQ ID Nos. 1-50 and 51-100 or the sequences complementary thereto). The present invention further
contemplates the genotyping of any biallelic marker within the provided maps, including those that are in linkage
disequilibrium with the 653 biallelic markers obtained above (which include the sequences of SEQ ID Nos. 1-50 and
51-100 or the sequences complementary thereto) or the markers of SEQ ID Nos. 301-312 or the sequences complementary
thereto.

Most genotyping methods require the previous amplification of a DNA region carrying the polymorphic site of
interest.

The identification of biallelic markers described previously allows the design of appropriate oligonucleotides,
which can be used as primers to amplify a DNA fragment containing the polymorphic site of interest and for the
detection of such polymorphisms.

In particularly preferred embodiments, pairs of primers of SEQ ID Nos: 313-318 and 319-324 may be used to
generate amplicons harboring the markers of SEQ ID Nos: 301-306/307-312 or the sequences complementary thereto. In
further embodiments, pairs of amplification primers may be used to generate amplicons harboring the 653 markers
obtained above (which include the sequences of SEQ ID Nos. 1-50 and 51-100 or the sequences complementary thereto).
In highly preferred embodiments, pairs of the amplification primers of SEQ ID Nos: 101-150 and 151-200 may be used
to generate amplicons harboring the markers of SEQ ID Nos: 1-50 and 51-100 or the sequences complementary thereto.
It will be appreciated that amplification primers may be designed having any length suitable for their intended
purpose, in particular any length allowing their hybridization with a region of the DNA fragment to be amplified.

It will be further appreciated that the hybridization site of said amplification primers may be located at any
distance from the polymorphic base to be genotyped, provided said amplification primers allow the proper amplification
of a DNA fragment carrying said polymorphic site. The amplification primers may be oligonucleotides of 10, 15, 20 or
more bases in length which enable the amplification of the polymorphic site in the markers. In some embodiments, the
amplification product produced using these primers may be at least 100 bases in length (i.e. on average 50 nucleotides
on each side of the polymorphic base). In other embodiments, the amplification product produced using these primers
may be at least 500 bases in length (i.e. on average 250 nucleotides on each side of the polymorphic base). In still
further embodiments, the amplification product produced using these primers may be at least 1000 bases in length (i.e.
on average 500 nucleotides on each side of the polymorphic base).

The amplification of polymorphic fragments can be carried out as described in Example 6 on DNA samples
extracted as described in Example 5.

As already mentioned, allele frequencies of biallelic markers tested in association studies (individual or
haplotype) may be determined using microsequencing procedures.
A first step in microsequencing procedures consists in designing microsequencing primers adapted to each
biallelic marker to be genotyped. Microsequencing primers hybridize upstream of the polymorphic base to be genotyped,
either with the coding or with the non-coding strand. Microsequencing primers may be oligonucleotides of 6, 10, 15, 20
or more bases in length. Preferably, the 3' end of the microsequencing primer is immediately upstream of the
polymorphic base of the biallelic marker being genotyped, such that upon extension of the primer, the polymorphic base
is the first base incorporated. Such microsequencing primers are included within the scope of the present invention.

In a preferred embodiment, the microsequencing primers are those indicated as features within the sequence
listings corresponding to markers of SEQ ID Nos: 325-330/331-336. In some embodiments, the 653 biallelic markers
obtained above (which include the sequences of SEQ ID Nos. 1-50 and 51-100 or the sequences complementary thereto)
are genotyped using appropriate microsequencing oligonucleotides such as those of SEQ ID Nos. 201-250 or 251-300.
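
The primer design principle stated above (the 3' end of the microsequencing primer lies immediately upstream of the polymorphic base, on either strand) can be sketched as follows. The function names, the default primer length and the example sequence are hypothetical, and the sketch ignores practical criteria such as melting temperature or secondary structure.

```python
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def microsequencing_primers(sequence, snp_index, length=19):
    """Return (forward, reverse) primers whose 3' ends abut the polymorphic base.

    sequence is one strand written 5' to 3'; snp_index is the 0-based
    position of the polymorphic base on that strand.
    """
    if snp_index - length < 0 or snp_index + 1 + length > len(sequence):
        raise ValueError("not enough flanking sequence for this primer length")
    # Forward primer: the bases directly 5' of the polymorphic base.
    forward = sequence[snp_index - length:snp_index]
    # Reverse primer: reverse complement of the bases directly 3' of it.
    reverse = reverse_complement(sequence[snp_index + 1:snp_index + 1 + length])
    return forward, reverse


# Hypothetical amplicon; the polymorphic base is at index 9.
fwd, rev = microsequencing_primers("AAACCCGGGTTTACGTAC", 9, length=4)
print(fwd, rev)  # CGGG GTAA
```

With either primer, the first base incorporated on extension corresponds to the polymorphic position: the base itself for the forward primer, its complement for the reverse primer.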

It will be appreciated that the biallelic markers of the present invention may be genotyped using
microsequencing primers having any desirable length, and hybridizing to any of the strands of the marker to be tested,
provided their design is suitable for their intended purpose. In some embodiments, the amplification primers or
microsequencing primers may be labeled. For example, in some embodiments, the amplification primers or
microsequencing primers may be biotinylated.

Typical microsequencing procedures that can be used in the context of the present invention are described in
Example 13 below.

Example 13

Genotyping of biallelic markers using microsequencing procedures
Several microsequencing protocols conducted in liquid phase are well known to those skilled in the art. A first
possible detection analysis allowing the allele characterization of the microsequencing reaction products relies on
detecting fluorescent ddNTP-extended microsequencing primers after gel electrophoresis. A first alternative to this
approach consists in performing a liquid phase microsequencing reaction, the analysis of which may be carried out in
solid phase.

For example, the microsequencing reaction may be performed using 5'-biotinylated oligonucleotide primers and
fluorescein-dideoxynucleotides. The biotinylated oligonucleotide is annealed to the target nucleic acid sequence
immediately adjacent to the polymorphic nucleotide position of interest. It is then specifically extended at its 3' end
following a PCR cycle, wherein the labeled dideoxynucleotide analog complementary to the polymorphic base is
incorporated. The biotinylated primer is then captured on a microtiter plate coated with streptavidin. The analysis is
thus entirely carried out in a microtiter plate format. The incorporated ddNTP is detected by a fluorescein antibody -
alkaline phosphatase conjugate.

In practice this microsequencing analysis is performed as follows. 20 µl of the microsequencing reaction is
added to 80 µl of capture buffer (SSC 2X, 25% PEG 8000, 0.25 M Tris pH 7.5, 1.8% BSA, 0.05% Tween 20) and
incubated for 20 minutes on a microtiter plate coated with streptavidin (Boehringer). The plate is rinsed once with
washing buffer (0.1 M Tris pH 7.5, 0.1 M NaCl, 0.1% Tween 20). 100 µl of anti-fluorescein antibody conjugated with
alkaline phosphatase, diluted 1/5000 in washing buffer containing 1.8% BSA, is added to the microtiter plate. The
antibody is incubated on the microtiter plate for 20 minutes. After washing the microtiter plate four times, 100 µl of
4-methylumbelliferyl phosphate (Sigma) diluted to 0.4 mg/ml in 0.1 M diethanolamine pH 9.6, 10 mM MgCl2 are added. The
detection of the microsequencing reaction is carried out on a fluorimeter (Dynatech) after 20 minutes of incubation.

As another alternative, solid phase microsequencing reactions have been developed, for which either the
oligonucleotide microsequencing primers or the PCR-amplified products derived from the DNA fragment of interest are
immobilized. For example, immobilization can be carried out via an interaction between biotinylated DNA and
streptavidin-coated microtitration wells or avidin-coated polystyrene particles.

As a further alternative, the PCR reaction generating the amplicons to be genotyped can be performed directly
in solid phase conditions, following procedures such as those described in WO 96/13609, the disclosure of which is
incorporated herein by reference.

In such solid phase microsequencing reactions, incorporated ddNTPs can either be radiolabeled (see Syvanen,
Clin. Chim. Acta, 226:225-236 (1994), the disclosure of which is incorporated herein by reference) or linked to
fluorescein (see Livak and Hainer, Hum. Mutat. 3:379-385 (1994), the disclosure of which is incorporated herein by
reference). The detection of radiolabeled ddNTPs can be achieved through scintillation-based techniques. The detection
of fluorescein-linked ddNTPs can be based on the binding of antifluorescein antibody conjugated with alkaline
phosphatase, followed by incubation with a chromogenic substrate (such as p-nitrophenyl phosphate).
Other possible reporter-detection couples for use in the above microsequencing procedures include:

ddNTP linked to dinitrophenyl (DNP) and anti-DNP alkaline phosphatase conjugate (see Harju et al., Clin.
Chem. 39(11 Pt 1):2282-2287 (1993), incorporated herein by reference);

biotinylated ddNTP and horseradish peroxidase-conjugated streptavidin with o-phenylenediamine as a substrate (see
WO 92/15712, incorporated herein by reference).

A diagnosis kit based on fluorescein-linked ddNTP with antifluorescein antibody conjugated with alkaline
phosphatase has been commercialized under the name PRONTO by GamidaGen Ltd.

As yet another alternative microsequencing procedure, Nyren et al. (Anal. Biochem. 208:171-175 (1993), the
disclosure of which is incorporated herein by reference) have described a solid phase DNA sequencing procedure that
relies on the detection of DNA polymerase activity by an enzymatic luminometric inorganic pyrophosphate detection
assay (ELIDA). In this procedure, the PCR-amplified products are biotinylated and immobilized on beads. The
microsequencing primer is annealed and four aliquots of this mixture are separately incubated with DNA polymerase and
one of the four different ddNTPs. After the reaction, the resulting fragments are washed and used as substrates in a
primer extension reaction with all four dNTPs present. The progress of the DNA-directed polymerization reactions is
monitored with the ELIDA. Incorporation of a ddNTP in the first reaction prevents the formation of pyrophosphate during
the subsequent dNTP reaction. In contrast, no ddNTP incorporation in the first reaction gives extensive pyrophosphate
release during the dNTP reaction and this leads to generation of light throughout the ELIDA reactions. From the ELIDA
results, the identity of the first base after the primer is easily deduced.
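
The deduction of the first base from the four ELIDA readings can be illustrated by the following minimal sketch. It covers only the simple case in which exactly one ddNTP aliquot shows suppressed light; the function name, the 0.5 suppression ratio and the signal values are hypothetical and are not taken from the cited procedure.

```python
def call_base_from_elida(light_by_ddntp, suppression_ratio=0.5):
    """Return the base whose ddNTP pre-incubation suppressed the light signal.

    light_by_ddntp maps 'A', 'C', 'G', 'T' to the light measured during the
    dNTP extension that followed incubation with the corresponding ddNTP.
    Returns None when no single aliquot is clearly suppressed.
    """
    reference = max(light_by_ddntp.values())
    if reference <= 0:
        return None  # no usable signal in any aliquot
    suppressed = [
        base
        for base, signal in light_by_ddntp.items()
        if signal < suppression_ratio * reference
    ]
    return suppressed[0] if len(suppressed) == 1 else None
```

For example, call_base_from_elida({"A": 100, "C": 98, "G": 5, "T": 102}) returns "G": the ddGTP aliquot was terminated in the first reaction, so little pyrophosphate was released in the second.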

It will be appreciated that several parameters of the above-described microsequencing procedures may be
successfully modified by those skilled in the art without undue experimentation. In particular, high throughput
improvements to these procedures may be elaborated, following principles such as those described further below.

It will be further appreciated that any other genotyping procedure may be applied to the genotyping of biallelic
markers.

Once the candidate region has been delineated using the high density biallelic marker map, a sequence analysis
process will allow the detection of all genes located within said region, together with a potential functional
characterization of said genes. The identified functional features may allow preferred trait causing candidates to be
chosen from among the identified genes. More biallelic markers may then be generated within said candidate genes, and
used to perform refined association studies that will support the identification of the trait causing gene. Sequence
analysis processes are described in Example 14 below.



Example 14: Sequence Analysis
DNA sequences, such as BAC inserts, containing the region carrying the candidate gene associated with the
detectable trait are sequenced and their sequence is analyzed using automated software which eliminates repeat
sequences while retaining potential gene sequences. The potential gene sequences are compared to numerous databases
to identify potential exons using a set of scoring algorithms such as trained Hidden Markov Models, statistical analysis
models (including promoter prediction tools) and the GRAIL neural network. Preferred databases for use in this analysis,
the construction and use of which are further detailed in Example 22 below, include the following:


NetGene database: 

This proprietary database contains sequences of 5' cDNA tags, obtained from a number of tissues and cells.
Currently more than 50,000 different 5' clones representing more than 50,000 different genes are included in NetGene.
The sequences in the NetGene database correspond specifically to the 5' regions of transcripts (first exons) and
therefore allow mapping of the beginning of genes within raw genomic sequences.

NRPU (Non-Redundant Protein-Unique) database:

NRPU is a non-redundant merge of the publicly available NBRF/PIR, Genpept, and SwissProt databases.
Homologies found with NRPU allow the identification of regions potentially coding for already known proteins or related
to known proteins (translated exons).



NREST (Non-Redundant EST database):

NREST is a merge of the EST subsection of the publicly available GenBank database. Homologies found with 
NREST allow the location of potentially transcribed regions (translated or non-translated exons). 


NRN (Non-Redundant Nucleic acid database):

NRN is a merge of GenBank, EMBL and their daily updates.






Any sequence giving a positive hit with NRPU, NREST or an 'excellent' score using GRAIL or/and other scoring
algorithms is considered a potential functional region, and is then considered a candidate for genomic analysis.

While this first screening allows the detection of the 'strongest' exons, a semi-automatic scan is further
applied to the remaining sequences in the context of the sequence assembly. That is, the sequences neighboring a 5'
site or an exon are submitted to another round of bioinformatics analysis with modified parameters. In this way, new
exon candidates are generated for genomic analysis.

Using the above procedures, genes associated with detectable traits may be identified.

Examples 15-23 illustrate the application of the above methods using biallelic markers to identify a gene
associated with a complex disease, prostate cancer, within a ca. 450 kb candidate region. Additional details of the
identification of the gene associated with prostate cancer are provided in the U.S. Patent Application entitled 'Prostate
Cancer Gene' (GENSET.018A, Serial No. 08/996,306), the disclosure of which is incorporated herein by reference.



Use of Biallelic Markers to Identify a Gene Associated with Prostate Cancer
Substantial amounts of LOH data supported the hypothesis that genes associated with distinct cancer types
are located within a particular region of the human genome. More specifically, this region was likely to harbor a gene
associated with prostate cancer. Association studies were performed as described below in order to identify this
prostate cancer gene. A YAC contig containing the genomic region suspected of harboring a gene associated with
prostate cancer was constructed as described in Example 15 below.

Example 15

YAC Contig Construction in the Candidate Genomic Region
First, a YAC contig which contains the candidate genomic region was constructed as follows. The CEPH-
Genethon YAC map for the entire human genome (Chumakov et al. (1995), supra) was used for detailed contig building in
the genomic region containing genetic markers known to map in the candidate genomic region. Screening data available
for several publicly available genetic markers were used to select a set of CEPH YACs localized within the candidate
region. This set of YACs was tested by PCR with the above mentioned genetic markers as well as with other publicly
available markers supposedly located within the candidate region. As a result of these studies, a YAC STS contig map
was generated around genetic markers known to map in this genomic region. Two CEPH YACs were found to constitute
a minimal tiling path in this region, with an estimated size of ca. 2 Megabases.
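
The notion of a minimal tiling path (the smallest set of overlapping clones that covers a region) can be illustrated by the following sketch, which uses a standard greedy interval-cover selection. This is not presented as the procedure actually used to select the two CEPH YACs; the clone names and coordinates (in kb) are hypothetical.

```python
def minimal_tiling_path(clones, start, end):
    """Return names of a smallest set of clones covering [start, end].

    clones is a list of (name, clone_start, clone_end) tuples.
    Returns None if the clones leave a gap in the region.
    """
    ordered = sorted(clones, key=lambda clone: clone[1])
    path = []
    covered = start
    i = 0
    while covered < end:
        best = None
        # Among clones starting at or before the covered point,
        # keep the one that extends furthest.
        while i < len(ordered) and ordered[i][1] <= covered:
            if best is None or ordered[i][2] > best[2]:
                best = ordered[i]
            i += 1
        if best is None or best[2] <= covered:
            return None  # gap: no clone extends the coverage
        path.append(best[0])
        covered = best[2]
    return path


# Hypothetical YAC coordinates in kb over a 2 Mb region.
yacs = [("YAC_A", 0, 1200), ("YAC_B", 300, 900), ("YAC_C", 1000, 2000)]
print(minimal_tiling_path(yacs, 0, 2000))  # ['YAC_A', 'YAC_C']
```

In practice clone positions are inferred from shared STS content rather than known coordinates, so the coordinates here stand in for the ordering that the STS contig map provides.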

During this mapping effort, several publicly known STS markers were precisely located within the contig.
Example 16 below describes the identification of sets of biallelic markers within the candidate genomic region.

Example 16
BAC contig construction and
Biallelic Markers isolation within the candidate chromosomal region
Next, a BAC contig covering the candidate genomic region was constructed as follows. BAC libraries were
obtained as described in Woo et al., Nucleic Acids Res. 22:4922-4931 (1994), the disclosure of which is incorporated
herein by reference. Briefly, the two whole human genome BamHI and HindIII libraries already described in Example 1
were constructed using the pBeloBAC11 vector (Kim et al. (1996), supra).

The BAC libraries were then screened with all of the above mentioned STSs, following the procedure described
in Example 2 above.

The ordered BACs selected by STS screening and verified by FISH were assembled into contigs and new
markers were generated by partial sequencing of insert ends from some of them. These markers were used to fill the
gaps in the contig of BAC clones covering the candidate chromosomal region having an estimated size of 2 megabases.

Figure 9 illustrates a minimal array of overlapping clones which was chosen for further studies, and the
positions of the publicly known STS markers along said contig.
Selected BAC clones from the contig were subcloned and sequenced, essentially following the procedures
described in Examples 3 and 4.

Biallelic markers lying along the contig were identified following the processes described in Examples 5 and 6.
Figure 9 shows the locations of the biallelic markers along the BAC contig. This first set of markers
corresponds to a medium density map of the candidate locus, with an inter-marker distance averaging 50kb-150kb.
A second set of biallelic markers was then generated as described above in order to provide a very high-density
map of the region identified using the first set of markers which can be used to conduct association studies, as
explained below. This very high density map has markers spaced on average every 2-50kb.

The biallelic markers were then used in association studies. DNA samples were obtained from individuals
suffering from prostate cancer and unaffected individuals as described in Example 17.
Example 17

Collection of DNA Samples from Affected and Non-affected Individuals
Prostate cancer patients were recruited according to clinical inclusion criteria based on pathological or radical
prostatectomy records. Control cases included in this study were both ethnically- and age-matched to the affected
cases; they were checked for both the absence of all clinical and biological criteria defining the presence or the risk of
prostate cancer, and for the absence of related familial prostate cancer cases. Both affected and control individuals
were all unrelated.

The two following groups of independent individuals were used in the association studies. The first group,
comprising individuals suffering from prostate cancer, contained 185 individuals. Of these 185 cases of prostate
cancer, 47 cases were sporadic and 138 cases were familial. The control group contained 104 non-diseased individuals.
Haplotype analysis was conducted using additional diseased (total samples: 281) and control samples (total
samples: 130), from individuals recruited according to similar criteria.

DNA was extracted from peripheral venous blood of all individuals as described in Example 5.
The frequencies of the biallelic markers in each population were determined as described in Example 18.

Example 18

Genotyping Affected and Control Individuals

Genotyping was performed using the following microsequencing procedure.
Amplification was performed on each DNA sample using primers designed as previously explained. The pairs of primers
were used to generate amplicons harboring the biallelic markers 4-26, 4-14, 4-77, 99-217, 4-67, 99-213,
99-221, 99-135, 99-1482, 4-73, and 4-65 using the protocols described in Example 6 above.

Microsequencing primers were designed for each of the biallelic markers, as previously described.
After purification of the amplification products, the microsequencing reaction mixture was prepared by adding, in a 20 µl
final volume: 10 pmol microsequencing oligonucleotide, 1 U Thermosequenase (Amersham E79000G), 1.25 µl
Thermosequenase buffer (260 mM Tris HCl pH 9.5, 65 mM MgCl2), and the two appropriate fluorescent ddNTPs (Perkin
Elmer, Dye Terminator Set 401095) complementary to the nucleotides at the polymorphic site of each biallelic marker
tested, following the manufacturer's recommendations. After 4 minutes at 94°C, 20 PCR cycles of 15 sec at 55°C, 5
sec at 72°C, and 10 sec at 94°C were carried out in a Tetrad PTC-225 thermocycler (MJ Research). The
unincorporated dye terminators were then removed by ethanol precipitation. Samples were finally resuspended in
formamide-EDTA loading buffer and heated for 2 min at 95°C before being loaded on a polyacrylamide sequencing gel.
The data were collected by an ABI PRISM 377 DNA sequencer and processed using the GENESCAN software (Perkin
Elmer).

Following gel analysis, data were automatically processed with software that allows the determination of the
alleles of biallelic markers present in each amplified fragment.

The software evaluates such factors as whether the intensities of the signals resulting from the above
microsequencing procedures are weak, normal or saturated, or whether the signals are ambiguous. In addition, the
software identifies significant peaks (according to shape and height criteria). Among the significant peaks, peaks
corresponding to the targeted site are identified based on their position. When two significant peaks are detected for
the same position, each sample is categorized as homozygous or heterozygous based on the height ratio.
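
The height-ratio rule described above can be sketched as follows. The actual criteria used by the software are not given in the text, so the function name, the minimum height and the heterozygosity ratio are hypothetical values chosen only for illustration.

```python
def call_genotype(peak_heights, min_height=100.0, het_ratio=0.3):
    """Call a genotype from allele-specific peak heights at the targeted site.

    peak_heights maps each allele to its peak height. Returns a tuple of two
    alleles, or None when no peak is significant (signal too weak).
    """
    significant = {a: h for a, h in peak_heights.items() if h >= min_height}
    if not significant:
        return None
    ranked = sorted(significant.items(), key=lambda item: item[1], reverse=True)
    top_allele, top_height = ranked[0]
    if len(ranked) == 1:
        return (top_allele, top_allele)
    second_allele, second_height = ranked[1]
    # Two significant peaks: heterozygous only if their heights are comparable.
    if second_height / top_height >= het_ratio:
        return tuple(sorted((top_allele, second_allele)))
    return (top_allele, top_allele)
```

For instance, call_genotype({"C": 900, "T": 850}) gives a heterozygous C/T call, while call_genotype({"C": 900, "T": 120}) is called homozygous C because the second peak is small relative to the first.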

Association analyses were then performed using the biallelic markers as described below.

Example 19 
Association Analysis 

Association studies were run in two successive steps. In a first step, a rough localization of the candidate
gene was achieved by determining the frequencies of the biallelic markers of Figure 9 in the affected and unaffected
populations. The results of this rough localization are shown in Figure 10. This analysis indicated that a gene
responsible for prostate cancer was located near the biallelic marker designated 4-67.

In a second phase of the analysis, the position of the gene responsible for prostate cancer was further refined using the
very high density set of markers including the 99-123, 4-26, 4-14, 4-77, 99-217, 4-67, 99-213, 99-221, 99-135,
99-1482, 4-73, and 4-65 markers.

As shown in Figure 11, the second phase of the analysis confirmed that the gene responsible for prostate
cancer was near the biallelic marker designated 4-67, most probably within a ca. 150kb region comprising the marker.

A haplotype analysis was also performed as described in Example 20.

Example 20
Haplotype analysis

The allelic frequencies of each of the alleles of biallelic markers 99-123, 4-26, 4-14, 4-77, 99-217, 4-67,
99-213, 99-221, and 99-135 were determined in the affected and unaffected populations. Table 4 lists the internal
identification numbers of the markers used in the haplotype analysis, the alleles of each marker, the most frequent allele
in both unaffected individuals and individuals suffering from prostate cancer, the least frequent allele in both unaffected
individuals and individuals suffering from prostate cancer, and the frequencies of the least frequent alleles in each
population.

Table 4

                                   Frequency of least frequent allele**
Markers    Polymorphic base*       Cases        Controls
99-123     C/T                     0.35         0.3
4-26       A/G                     0.39         0.40
4-14       C/T                     0.35         0.41
4-77       C/G                     0.33         0.24
99-217     C/T                     0.31         0.23
4-67       C/T                     0.26         0.16
99-213     T/C                     0.45         0.38
99-221     C/A                     0.43         0.43
99-135     A/G                     0.25         0.3

*  most frequent allele/least frequent allele
** standard deviations: 0.023 to 0.031 for controls
                        0.018 to 0.021 for cases
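
The standard deviation ranges quoted in the footnote to Table 4 can be compared with a simple binomial estimate, as in the following sketch. Two assumptions are made that are not stated in the text: that each frequency is estimated from 2N independent chromosomes, and that N corresponds to the haplotype analysis samples of Example 17 (281 cases, 130 controls). Under these assumptions the computed ranges agree with the footnote.

```python
import math


def allele_frequency_sd(freq, n_individuals):
    """Binomial standard deviation of an allele frequency over 2N chromosomes."""
    return math.sqrt(freq * (1.0 - freq) / (2 * n_individuals))


# Frequencies of the least frequent allele, copied from Table 4.
case_freqs = [0.35, 0.39, 0.35, 0.33, 0.31, 0.26, 0.45, 0.43, 0.25]
control_freqs = [0.3, 0.40, 0.41, 0.24, 0.23, 0.16, 0.38, 0.43, 0.3]

# Assumed sample sizes (haplotype analysis groups of Example 17).
case_sds = [allele_frequency_sd(f, 281) for f in case_freqs]
control_sds = [allele_frequency_sd(f, 130) for f in control_freqs]

print(round(min(case_sds), 3), round(max(case_sds), 3))        # 0.018 0.021
print(round(min(control_sds), 3), round(max(control_sds), 3))  # 0.023 0.031
```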



Among all the theoretical potential different haplotypes based on 2 to 9 markers, 11 haplotypes showing a
strong association with prostate cancer were selected. The results of these haplotype analyses are shown in Figure 12.

Figures 11 and 12 aggregate association analysis results with sequencing results (generated following the
procedures further described in Example 21), which permitted the physical order and/or the distance between markers to
be estimated.

The significance of the values obtained in Figure 12 is underscored by the following results of computer
simulations. For the computer simulations, the data from the affected individuals and the unaffected controls were
pooled and randomly allocated to two groups which contained the same number of individuals as the affected and
unaffected groups used to compile the data summarized in Figure 12. A haplotype analysis was run on these artificial
groups for the six markers included in haplotype 5 of Figure 12. This experiment was reiterated 100 times and the
results are shown in Figure 13. Among 100 iterations, only 5% of the obtained haplotypes are present with a p-value
less significant than E-04 as compared to the p-value of 9E-07 for haplotype 5 of Figure 12. Furthermore, for haplotype
5 of Figure 12, only 6% of the obtained haplotypes have a significance level below 5E-03, while none of them show a
significance level below

Thus, using the data of Figure 13 and evaluating the associations for single marker alleles or for haplotypes
will permit estimation of the risk a corresponding carrier has to develop prostate cancer. It will be appreciated that
significance thresholds of relative risks will be more finely assessed according to the population tested.

Diagnostic techniques for determining an individual's risk of developing prostate cancer may be implemented as
described below for the markers in the maps of the present invention, including the 99-123, 4-26, 4-14, 4-77, 99-217,
4-67, 99-213, 99-221, and 99-135 markers.

The above haplotype analysis indicated that 171kb of genomic DNA between biallelic markers 4-14 and
99-221 totally or partially contains a gene responsible for prostate cancer. Therefore, the protein coding sequences lying
within this region were characterized to locate the gene associated with prostate cancer. This analysis, described in
further detail below, revealed a single protein coding sequence in the 171 kb genomic region, which was designated as
the PG1 gene.

Example 21

Identification of the Genomic Sequence in the Candidate Region

Template DNA for sequencing the PG1 gene was obtained as follows. BACs E and F from Fig. 9 were subcloned
as previously described. Plasmid inserts were first amplified by PCR on PE 9600 thermocyclers (Perkin-Elmer), using
appropriate primers, AmpliTaqGold (Perkin-Elmer), dNTPs (Boehringer), buffer and cycling conditions as recommended by the
Perkin-Elmer Corporation.

PCR products were then sequenced using automatic ABI Prism 377 sequencers (Perkin Elmer, Applied Biosystems
Division, Foster City, CA). Sequencing reactions were performed using PE 9600 thermocyclers (Perkin Elmer) with standard
dye-primer chemistry and ThermoSequenase (Amersham Life Science). The primers were labeled with the JOE, FAM, ROX
and TAMRA dyes. The dNTPs and ddNTPs used in the sequencing reactions were purchased from Boehringer. Sequencing
buffer, reagent concentrations and cycling conditions were as recommended by Amersham.

Following the sequencing reaction, the samples were precipitated with EtOH, resuspended in formamide loading
buffer, and loaded on a standard 4% acrylamide gel. Electrophoresis was performed for 2.5 hours at 3000V on an ABI 377
sequencer, and the sequence data were collected and analyzed using the ABI Prism DNA Sequencing Analysis Software,
version 2.1.2.

The sequence data obtained as described above were transferred to a proprietary database, where quality control
and validation steps were performed. A proprietary base-caller flagged suspect peaks, taking into account the shape of the
peaks, the inter-peak resolution, and the noise level. The proprietary base-caller also performed an automatic trimming. Any
stretch of 25 or fewer bases having more than 4 suspect peaks was considered unreliable and was discarded.
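
By way of illustration only, the trimming rule stated above may be sketched in Python as follows. The proprietary
base-caller itself is not disclosed, so the function names, the per-base list of suspect flags, and the choice to discard
every base of any offending 25-base window are assumptions made for this sketch and are not the actual implementation.

```python
from typing import List


def unreliable_mask(suspect: List[bool], window: int = 25, max_suspect: int = 4) -> List[bool]:
    """Flag every base that lies in a window of `window` bases holding more than `max_suspect` suspect peaks."""
    n = len(suspect)
    mask = [False] * n
    if n == 0:
        return mask
    width = min(window, n)            # reads shorter than the window are judged as a whole
    count = sum(suspect[:width])      # suspect peaks in the current window
    start = 0
    while True:
        if count > max_suspect:
            for i in range(start, start + width):
                mask[i] = True
        if start + width >= n:
            break
        # slide the window one base to the right
        count += suspect[start + width] - suspect[start]
        start += 1
    return mask


def trim(sequence: str, suspect: List[bool]) -> str:
    """Return the read with all bases flagged as unreliable removed (hypothetical helper)."""
    mask = unreliable_mask(suspect)
    return "".join(base for base, bad in zip(sequence, mask) if not bad)
```

A stretch shorter than 25 bases that contains more than 4 suspect peaks always lies inside some 25-base window of the
read, so a single sliding window covers the "25 or fewer bases" wording for reads of at least 25 bases.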

The sequence fragments from BAC subclones isolated as described above were assembled using Gap4
software from R. Staden (Bonfield et al. 1995). This software allows the reconstruction of a single sequence from
sequence fragments. The sequence deduced from the alignment of different fragments is called the consensus
sequence. Directed sequencing techniques (primer walking) were used to complete sequences and link contigs.
Potential functional sequences were then identified as described in Example 22.

Example 22
Identification of Functional Sequences

Potential exons in BAC-derived human genomic sequences were located by homology searches on protein, nucleic
acid and EST (Expressed Sequence Tags) public databases. Main public databases were locally reconstructed as mentioned
in Example 14. The protein database, NRPU (Non-redundant Protein Unique), is formed by a non-redundant fusion of the
Genpept (Benson et al., Nucleic Acids Res. 24:1-5 (1996), the disclosure of which is incorporated herein by reference),
Swissprot (Bairoch, A. and Apweiler, R., Nucleic Acids Res. 24:21-25 (1996), the disclosure of which is incorporated herein
by reference) and PIR/NBRF (George et al., Nucleic Acids Res. 24:17-20 (1996), the disclosure of which is incorporated
herein by reference) databases. Redundant data were eliminated by using the NRDB software (Benson et al. (1996), supra)
and internal repeats were masked with the XNU software (Benson et al., supra). Homologies found using the NRPU
database allowed the identification of sequences corresponding to potential coding exons related to known proteins.

The EST local database is composed by the gbest section (1-9) of GenBank (Benson et al. (1996), supra), and thus
contains all publicly available transcript fragments. Homologies found with this database allowed the localization of
potentially transcribed regions.

The local nucleic acid database contained all sections of GenBank and EMBL (Rodriguez-Tome et al., Nucleic Acids
Res. 24:6-12 (1996), the disclosure of which is incorporated herein by reference) except the EST sections. Redundant data
were eliminated as previously described.

Similarity searches in protein or nucleic acid databases were performed using the BLAST software (Altschul et al.,
J. Mol. Biol. 215:403-410 (1990), the disclosure of which is incorporated herein by reference). Alignments were refined
using the Fasta software, and multiple alignments used Clustal W. Homology thresholds were adjusted for each analysis
based on the length and the complexity of the tested region, as well as on the size of the reference database.

Potential exon sequences identified as above were used as probes to screen cDNA libraries. Extremities of positive
clones were sequenced and the sequence stretches were positioned on the genomic sequence determined above. Primers
were then designed using the results from these alignments in order to enable the cloning of cDNAs derived from the gene
associated with prostate cancer that was identified using the above procedures.

The obtained cDNA molecules were then sequenced and results of Northern blot analysis of prostate mRNAs
supported the existence of a major cDNA having a 5-6kb length. The structure of the gene associated with prostate cancer
was evaluated as described in Example 23.

Example 23
Analysis of Gene Structure

The intron/exon structure of the gene was finally completely deduced by aligning the mRNA sequence from the
cDNA obtained as described above and the genomic DNA sequence obtained as described above. This alignment
permitted the determination of the positions of the introns and exons, the positions of the start and end nucleotides
defining each of the at least 8 exons, the locations and phases of the 5' and 3' splice sites, the position of the stop
codon, and the position of the polyadenylation site to be determined in the genomic sequence. This analysis also yielded
the positions of the coding region in the mRNA, and the locations of the polyadenylation signal and polyA stretch in the
mRNA.

The gene identified as described above comprises at least 8 exons and spans more than 52kb. A G/C rich
putative promoter region was identified upstream of the coding sequence. A CCAAT in the putative promoter was also
identified. The promoter region was identified as described in Prestridge, D.S., Predicting Pol II Promoter Sequences
Using Transcription Factor Binding Sites, J. Mol. Biol. 249:923-932 (1995), the disclosure of which is incorporated
herein by reference.

Additional analysis using conventional techniques, such as a 5'RACE reaction using the Marathon-Ready
human prostate cDNA kit from Clontech (Catalog No. PT1156-1), may be performed to confirm that the 5' of the cDNA
obtained above is the authentic 5' end in the mRNA.

Alternatively, the 5' sequence of the transcript can be determined by conducting a PCR amplification with a
series of primers extending from the 5' end of the identified coding region.

The above methods were also used to identify biallelic markers in a gene which was an attractive candidate for
a gene associated with asthma. Examples 24-31 show how the use of methods of the present invention allowed this
gene to be identified as a gene responsible, at least partially, for asthma in the studied populations. Additional details of
the identification of the gene associated with asthma are provided in U.S. Provisional Application Serial No.
60/081,893 (Genset.026PR) and U.S. Provisional Patent Application Genset.026PR2, the disclosures of which are
incorporated herein by reference.

Example 24

Detection of biallelic markers in the candidate gene: DNA extraction
Donors were unrelated and healthy. They presented a sufficient diversity for being representative of a French
heterogeneous population. The DNA from 100 individuals was extracted and tested for the detection of the biallelic
markers.

30 ml of peripheral venous blood were taken from each donor in the presence of EDTA. Cells (pellet) were
collected after centrifugation for 10 minutes at 2000 rpm. Red cells were lysed by a lysis solution (50 ml final volume:
10 mM Tris pH7.6; 5 mM MgCl2; 10 mM NaCl). The solution was centrifuged (10 minutes, 2000 rpm) as many times as
necessary to eliminate the residual red cells present in the supernatant after resuspension of the pellet in the lysis
solution.

The pellet of white cells was lysed overnight at 42°C with 3.7 ml of lysis solution composed of:

- 3 ml TE 10-2 (Tris-HCl 10 mM, EDTA 2 mM) / NaCl 0.4 M
- 200 µl SDS 10%
- 500 µl K-proteinase (2 mg K-proteinase in TE 10-2 / NaCl 0.4 M)

For the extraction of proteins, 1 ml saturated NaCl (6M) (1/3.5 v/v) was added. After vigorous agitation, the
solution was centrifuged for 20 minutes at 10000 rpm.

For the precipitation of DNA, 2 to 3 volumes of 100% ethanol were added to the previous supernatant and the solution
was centrifuged for 30 minutes at 2000 rpm. The DNA solution was rinsed three times with 70% ethanol to eliminate
salts, and centrifuged for 20 minutes at 2000 rpm. The pellet was dried at 37°C, and resuspended in 1 ml TE 10-1 or 1
ml water. The DNA concentration was evaluated by measuring the OD at 260 nm (1 unit OD = 50 µg/ml DNA).

To determine the presence of proteins in the DNA solution, the OD 260 / OD 280 ratio was determined. Only
DNA preparations having an OD 260 / OD 280 ratio between 1.8 and 2 were used in the subsequent examples described
below.
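
The two acceptance calculations above can be expressed as a short Python sketch. The function names and the
optional dilution factor are hypothetical and are introduced only for illustration; the conversion factor (1 OD unit at
260 nm = 50 µg/ml DNA) and the 1.8 to 2 purity window are the values stated in this Example.

```python
def dna_concentration_ug_per_ml(od260: float, dilution_factor: float = 1.0) -> float:
    """Convert an OD reading at 260 nm into a DNA concentration (1 OD unit = 50 ug/ml DNA)."""
    return od260 * 50.0 * dilution_factor


def passes_purity_check(od260: float, od280: float, low: float = 1.8, high: float = 2.0) -> bool:
    """Accept a preparation only if its OD 260 / OD 280 ratio lies between `low` and `high`."""
    if od280 <= 0:
        raise ValueError("OD 280 must be positive")
    return low <= od260 / od280 <= high
```

For instance, a reading of 0.5 OD at 260 nm on an undiluted sample corresponds to 25 µg/ml DNA under this conversion.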

The pool was constituted by mixing equivalent quantities of DNA from each individual.

Example 25

Detection of the biallelic markers: amplification of genomic DNA by PCR
The amplification of specific genomic sequences of the DNA samples of Example 24 was carried out on the
pool of DNA obtained previously. In addition, 50 individual samples were similarly amplified.

PCR assays were performed using the following protocol:

Final volume                                            25 µl
DNA                                                     2 ng/µl
MgCl2                                                   2 mM
dNTP (each)                                             200 µM
primer (each)                                           2.9 ng/µl
Ampli Taq Gold DNA polymerase                           0.05 unit/µl
PCR buffer (10x = 0.1 M Tris-HCl pH 8.3, 0.5 M KCl)     1x



Pairs of first primers were designed to amplify the promoter region, exons, and 3' end of the candidate asthma-
associated gene using the sequence information of the candidate gene and the OSP software (Hillier & Green, 1991).
These first primers were about 20 nucleotides in length and contained a common oligonucleotide tail upstream of the
specific bases targeted for amplification which was useful for sequencing. The synthesis of these primers was
performed following the phosphoramidite method, on a GENSET UFPS 24.1 synthesizer.

DNA amplification was performed on a Genius II thermocycler. After heating at 94°C for 10 min, 40 cycles
were performed. Each cycle comprised: 30 sec at 94°C, 55°C for 1 min, and 30 sec at 72°C. For final elongation, 7 min
at 72°C ended the amplification. The quantities of the amplification products obtained were determined on 96-well
microtiter plates, using a fluorometer and Picogreen as intercalant agent (Molecular Probes).
Example 26

Detection of the biallelic markers: sequencing of amplified genomic DNA and identification of polymorphisms
The sequencing of the amplified DNA obtained in Example 25 was carried out on ABI 377 sequencers. The
sequences of the amplification products were determined using automated dideoxy terminator sequencing reactions with
a dye terminator cycle sequencing protocol. The products of the sequencing reactions were run on sequencing gels and
the sequences were analyzed as formerly described.

The sequence data were further evaluated using the above mentioned polymorphism analysis software
designed to detect the presence of biallelic markers among the pooled amplified fragments. The polymorphism search
was based on the presence of superimposed peaks in the electrophoresis pattern resulting from different bases occurring
at the same position as described previously.

Six fragments of amplification were analyzed. In these segments, 8 biallelic markers were detected. The
localization of the biallelic markers, the polymorphic bases of each allele, and the frequencies of the most frequent
alleles were as shown in Table 5.



Table 5

Amplicon  Marker Name  Origin of DNA  Localization in gene  Polymorphism   Frequency (%)
1         204/326      Ind.           Promoter              A/G            96.2 (G)
2         32/357       Pool           Intron 1              A/C            67.[illegible] (C)
3         33/175       Ind.           Exon 2                C/T            97.3 (C)
3         33/234       Pool           Intron 2              A/C            56.7 (C)
3         33/327       Ind.           Intron 2              C/T            75.3 (T)
5         35/358       Pool           Intron 4              C/G            67.9 (G)
5         35/390       Ind.           Intron 4              C/T            82 (C)
6         36/164       Ind.           Exon 5                [illegible]    99.5 (G)



Allelic frequencies were determined in a population of random blood donors from French Caucasian origin. Their wide
range is due to the fact that besides screening a pool of 100 individuals to generate biallelic markers as described
above, polymorphism searches were also conducted in an individual testing format for 50 samples. This strategy was
chosen here to provide a potential shortcut towards the identification of putative causal mutations in the association
studies using them. As the 36/164 biallelic marker was found in only one individual, this marker was not considered in
the association studies.

The fourth fragment of amplification carrying exon 3 (not shown in the Table) was not polymorphic in the
tested samples (1 pool + 50 individuals).

Example 27

Validation of the polymorphisms through microsequencing
The biallelic markers identified in Example 26 were further confirmed and their respective frequencies were
determined through microsequencing. Microsequencing was carried out for each individual DNA sample described in
Example 24.

Amplification from genomic DNA of individuals was performed by PCR as described above for the detection of
the biallelic markers with the same set of PCR primers described above.

The preferred primers used in microsequencing had about 19 nucleotides in length and hybridized just upstream
of the considered polymorphic base.
Five primers hybridized with the non-coding strand of the gene. For the biallelic markers 204/326, 35/358 and 36/164,
primers hybridized with the coding strand of the gene.

The microsequencing reaction was performed as described in Example 18.

Example 28

Association study between asthma and the biallelic markers of the candidate gene: collection of DNA samples from
affected and non-affected individuals

The asthmatic population used to perform association studies in order to establish whether the candidate gene
was an asthma-causing gene consisted of 298 individuals. More than 90% of these 298 asthmatic individuals had a
Caucasian ethnic background.

The control population consisted of 373 unaffected individuals, among which 279 French (at least 70% were
of Caucasian origin) and 94 American (at least 90% were of Caucasian origin).

DNA samples were obtained from asthmatic and non-asthmatic individuals as described above.

Example 29

Association study between asthma and the biallelic markers of the candidate gene: genotyping of affected and control
individuals

The general strategy to perform the association studies was to individually scan the DNA samples from all
individuals in each of the populations described above in order to establish the allele frequencies of the above described
biallelic markers in each of these populations.

Allelic frequencies of the above-described biallelic markers in each population were determined by performing
microsequencing reactions on amplified fragments obtained by genomic PCR performed on the DNA samples from each
individual. Genomic PCR and microsequencing were performed as detailed above in Examples 25 and 27 using the
described amplification and microsequencing primers.

Example 30

Association study between asthma and the biallelic markers of the candidate gene
Table 6 shows the results of the association study between five biallelic markers in the candidate gene and
asthma.

Table 6

           Allelic frequencies (%)
Markers    Asthmatics           Controls             Frequency diff.    P value
           (298 individuals)    (373 individuals)
32/357     A 38.6               A 29.8               8.8                7.34x10^-4
33/234     A 49                 A 44.3               4.7                6.86x10^-2
33/327     T 70.5               T 74.6               3.9                1.0x10^-1
35/358     G 72.3               G 66.9               5.4                3.50x10^-2
35/390     T 30.4               T 20.3               10.1               2.33x10^-5


As shown in Table 6, markers 32/357 and 35/390 presented a strong association with asthma, this association being
highly significant (p value = 7.34x10^-4 for marker 32/357 and 2.33x10^-5 for marker 35/390).

Three markers showed moderate association when tested independently, namely 33/234, 33/327 and 35/358.
It is worth mentioning that allelic frequencies for each of the biallelic markers of Table 6 were separately
measured within the French control population (279 individuals) and the American control population (94 individuals).
The differences in allele frequencies between the two populations were between 1% and 7%, with p-values above 10^-1.
These data confirmed that the combined French/American control population (373 individuals) was homogeneous enough
to be used as a control population for the present association study.
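
As a hedged illustration of how single-marker p values of this kind may be obtained, the Python sketch below applies
an uncorrected 2x2 chi-square test (one degree of freedom) to allele counts derived from allele frequencies and group
sizes. The function name is hypothetical, and the exact test and any correction used to produce Table 6 are not stated
in this Example, so the result should only be expected to agree in order of magnitude.

```python
import math
from typing import Tuple


def allele_chi_square(freq_cases: float, n_cases: int,
                      freq_controls: float, n_controls: int) -> Tuple[float, float]:
    """Chi-square (1 df) on allele counts; frequencies are fractions, each individual carries 2 alleles."""
    a = freq_cases * 2 * n_cases            # copies of the tested allele in cases
    b = 2 * n_cases - a                     # copies of the other allele in cases
    c = freq_controls * 2 * n_controls      # copies of the tested allele in controls
    d = 2 * n_controls - c                  # copies of the other allele in controls
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # For one degree of freedom the upper tail probability is erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value
```

Applied to an allele carried at 38.6% in 298 asthmatic individuals and at 29.8% in 373 controls, this sketch returns a
p value on the order of 10^-4.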


Example 31

Association studies: Haplotype frequency analysis
As already shown, one way of increasing the statistical power of individual markers is by performing
haplotype association analysis. A haplotype analysis for association of markers in the candidate gene and asthma was
performed by estimating the frequencies of all possible haplotypes for biallelic markers 32/357, 33/234, 33/327, 35/358
and 35/390 in the asthmatic and control populations described in Example 30 (Table 6), and comparing these frequencies
by means of a chi square statistical test (one degree of freedom). Haplotype estimations were performed by applying the
Expectation-Maximization (EM) algorithm (Excoffier L & Slatkin M, 1995, Mol. Biol. Evol. 12:921-927), using the
EM-HAPLO program (Hawley ME, Pakstis AJ & Kidd KK, 1994, Am. J. Phys. Anthropol. 18:104).
The results of such haplotype analysis are shown in Table 7.
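
For readers unfamiliar with the Expectation-Maximization approach named above, a minimal Python sketch is given
below. It is restricted to two biallelic markers, whereas the analysis of this Example used five markers and the
EM-HAPLO program; the genotype encoding, the function name, the uniform starting frequencies and the fixed
iteration count are all assumptions made for illustration and do not describe EM-HAPLO itself.

```python
from collections import Counter
from typing import Dict, Iterable, Tuple

Genotype = Tuple[int, int]   # copies (0, 1 or 2) of allele "1" at marker 1 and at marker 2
Haplotype = Tuple[int, int]  # allele (0 or 1) carried at marker 1 and at marker 2

HAPLOTYPES = [(0, 0), (0, 1), (1, 0), (1, 1)]


def em_haplotype_frequencies(genotypes: Iterable[Genotype],
                             iterations: int = 100) -> Dict[Haplotype, float]:
    """Estimate two-marker haplotype frequencies from unphased genotypes by EM."""
    counts = Counter(genotypes)
    n_chromosomes = 2 * sum(counts.values())
    if n_chromosomes == 0:
        raise ValueError("no genotypes supplied")
    freq = {h: 0.25 for h in HAPLOTYPES}  # uniform starting point (assumption)
    for _ in range(iterations):
        expected = {h: 0.0 for h in HAPLOTYPES}
        for (g1, g2), n in counts.items():
            if g1 == 1 and g2 == 1:
                # E-step: a double heterozygote is either 00/11 or 01/10
                cis = freq[(0, 0)] * freq[(1, 1)]
                trans = freq[(0, 1)] * freq[(1, 0)]
                total = cis + trans
                w = cis / total if total > 0 else 0.5
                for h in ((0, 0), (1, 1)):
                    expected[h] += n * w
                for h in ((0, 1), (1, 0)):
                    expected[h] += n * (1.0 - w)
            else:
                # phase is unambiguous when at most one marker is heterozygous
                first = (int(g1 >= 1), int(g2 >= 1))
                second = (int(g1 == 2), int(g2 == 2))
                expected[first] += n
                expected[second] += n
        # M-step: expected haplotype counts become the new frequency estimates
        freq = {h: expected[h] / n_chromosomes for h in HAPLOTYPES}
    return freq
```

Running the function separately on the genotypes of the affected and control groups would yield the two sets of
haplotype frequencies that are then compared by the chi square test.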



Table 7

                  Markers                                                     Haplotype frequencies
                  32/357      33/234      33/327      35/358      35/390      Asthm.  Controls  Odds ratio  P value
Frequency diff.   8.8         4.7         3.9                     10.1
P value           7.34x10^-4                                      2.33x10^-5
Haplotype 1       A                                               T           0.2     0.11      2.02        8.47x10^-6
Haplotype 2                   A           T           G                       0.27    0.18      1.68        2.81x10^-4
Haplotype 3       A           A           T           G           T           0.18    0.09      2.22        3.95x10^-5

A two-marker haplotype covering markers 32/357 and 35/390 (haplotype 1, AT alleles respectively) presented
a p value of 8.47x10^-6, an odds ratio of 2.02 and haplotype frequencies of 0.2 for asthmatic and 0.11 for control
populations respectively.

A three-marker haplotype covering markers 33/234, 33/327 and 35/358 (haplotype 2, ATG alleles respectively)
presented a p value of 2.81x10^-4, an odds ratio of 1.68 and haplotype frequencies of 0.27 for asthmatic and 0.18 for
control populations respectively.

A five-marker haplotype covering markers 32/357, 33/234, 33/327, 35/358 and 35/390 (haplotype 3, AATGT
alleles respectively) presented a p value of 3.95x10^-5, an odds ratio of 2.22 and haplotype frequencies of 0.18 for
asthmatic and 0.09 for control populations respectively.
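
The relationship between the haplotype frequencies and the odds ratios quoted above can be checked with the short
Python sketch below. The function name is hypothetical, and it is an assumption of this sketch that the odds ratio is
the ratio of the odds of carrying the haplotype on a chromosome from the asthmatic group to the corresponding odds in
the control group.

```python
def odds_ratio(freq_cases: float, freq_controls: float) -> float:
    """Odds ratio computed from the frequency of a haplotype in cases and in controls."""
    odds_cases = freq_cases / (1.0 - freq_cases)
    odds_controls = freq_controls / (1.0 - freq_controls)
    return odds_cases / odds_controls


for name, cases, controls in [("haplotype 1", 0.20, 0.11),
                              ("haplotype 2", 0.27, 0.18),
                              ("haplotype 3", 0.18, 0.09)]:
    print(f"{name}: odds ratio = {odds_ratio(cases, controls):.2f}")
```

With the rounded frequencies given above, this calculation gives odds ratios of about 2.02, 1.68 and 2.22 for
haplotypes 1, 2 and 3 respectively.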

Haplotype association analysis thus increased the statistical power of the individual marker association
studies when compared to single-marker analysis (from p values between 10^-1 and 2x10^-5 for the individual markers to p
values between 3x10^-4 and 8x10^-6 for the three-marker haplotype, haplotype 2).

The significance of the values obtained for the haplotype association analysis was evaluated by the following
computer simulation test. The genotype data from the asthmatic and control individuals were pooled and randomly
allocated to two groups which contained the same number of individuals as the trait positive and trait negative groups
used to produce the data summarized in Table 7. A haplotype analysis was then run on these artificial groups for the
three haplotypes presented in Table 7. This experiment was reiterated 1000 times and the results are shown in Table 8.

Table 8

                       Permutation Test
Haplotype              Chi-Square         Average Chi-Square   Maximal Chi-Square   P value
Haplotype 1            19.[illegible]0    1.2                  11.6
Haplotype 2 (ATG)      13.49              1.2                  10.5                 1.0x10^[illegible]
Haplotype 3 (AATGT)    16.66              1.2                  9.3                  1.0x10^[illegible]



The results in Table 8 show that among 1000 iterations only 1% of the obtained haplotypes has a p value
comparable to the one obtained in Table 7.

These results clearly validate the statistical significance of the haplotypes obtained (haplotypes 1, 2 and 3,
Table 7).
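
The computer simulation test described above is a label permutation test. A minimal Python sketch is given below,
under simplifying assumptions: each individual is represented by the number of copies (0, 1 or 2) of the haplotype of
interest, which is taken as known rather than re-estimated by EM in every iteration, and the empirical p value uses
the (k+1)/(N+1) estimator. The function names are hypothetical. The four returned values mirror the columns of Table 8.

```python
import random
from typing import Sequence, Tuple


def haplotype_chi_square(copies_a: Sequence[int], copies_b: Sequence[int]) -> float:
    """2x2 chi-square (1 df) comparing haplotype counts on the chromosomes of two groups."""
    a = sum(copies_a)
    b = 2 * len(copies_a) - a
    c = sum(copies_b)
    d = 2 * len(copies_b) - c
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        return 0.0
    return (a + b + c + d) * (a * d - b * c) ** 2 / denom


def permutation_test(copies_cases: Sequence[int], copies_controls: Sequence[int],
                     iterations: int = 1000, seed: int = 0) -> Tuple[float, float, float, float]:
    """Return (observed, average permuted, maximal permuted chi-square, empirical p value)."""
    observed = haplotype_chi_square(copies_cases, copies_controls)
    pooled = list(copies_cases) + list(copies_controls)
    n_cases = len(copies_cases)
    rng = random.Random(seed)
    permuted = []
    for _ in range(iterations):
        rng.shuffle(pooled)  # random reallocation keeping the original group sizes
        permuted.append(haplotype_chi_square(pooled[:n_cases], pooled[n_cases:]))
    exceed = sum(1 for s in permuted if s >= observed)
    p_value = (exceed + 1) / (iterations + 1)
    return observed, sum(permuted) / len(permuted), max(permuted), p_value
```

An observed chi-square well above the maximal permuted value, as in Table 8, indicates that the association is
unlikely to arise from random allocation of individuals to the two groups.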

While Examples 15-31 illustrate the use of the maps and markers of the present invention for identifying a new
gene associated with a complex disease within a 2Mb genomic region or for establishing that a candidate gene is, at least
partially, responsible for a disease, the maps and markers of the present invention may also be used to identify one or
more biallelic markers or one or more genes associated with other detectable phenotypes, including drug response, drug
toxicity, or drug efficacy. The biallelic markers used in such drug response analyses or shown, using the methods of the
present invention, to be associated with such traits, may lie within or near genes responsible for or partly responsible for
a particular disease, for example a disease against which the drug is meant to act, or may lie within genomic regions
which are not responsible for or partly responsible for a disease. For example, the genomic region harboring markers
associated with a particular drug response may carry a drug metabolism gene, or a gene encoding a protein with a role in
the drug response mechanism. Thus, biallelic markers within or near genes known to be involved in drug response,
toxicity, or efficacy or genes suspected of being involved in drug response, toxicity, or efficacy may be used to identify
individuals likely to respond positively or negatively to drug treatment. In the context of the present invention, a "positive
response" to a medicament can be defined as comprising a reduction of the symptoms related to the disease or condition
to be treated. In the context of the present invention, a "negative response" to a medicament can be defined as
comprising either a lack of positive response to the medicament which does not lead to a symptom reduction or to a
side-effect observed following administration of the medicament.

Drug efficacy, response and tolerance/toxicity can be considered as multifactorial traits involving a genetic
component in the same way as complex diseases such as Alzheimer's disease, prostate cancer, hypertension or diabetes.
As such, the identification of genes involved in drug efficacy and toxicity could be achieved following a positional cloning
approach, e.g. performing linkage analysis within families in order to obtain the subchromosomal location of the gene(s).
However, this type of analysis is actually impractical in the case of drug responsiveness, due to the lack of availability of
familial cases. In fact, the likelihood of having more than one individual in a particular family being exposed to the same
drug at the same time is very low. Therefore, drug efficacy and toxicity can only be analyzed as sporadic traits.

In order to conduct association studies to analyze the individual response to a given drug in groups of patients
affected with a disease, up to four groups are screened to determine their patterns of biallelic markers using the
techniques described above. The four groups are:
- Non-diseased or random controls,
- Diseased patients/drug responders,
- Diseased patients/drug non-responders,
- Diseased patients/drug side effects.

In preferred embodiments, the above mentioned groups are recruited according to phenotyping criteria having
the characteristics described above, so that the phenotypes defining the different groups are non-overlapping, preferably
extreme phenotypes.

In highly preferred embodiments, such phenotyping criteria have the bimodal distribution described above.
The final number and composition of the groups for each drug association study is adapted
to the distribution of the above described phenotypes within the studied population.

After selecting a suitable population, association and haplotype analyses may be performed as
described herein to identify one or more biallelic markers associated with drug response, preferably drug toxicity or drug
efficacy. The identification of such one or more biallelic markers allows one to conduct diagnostic tests to determine
whether the administration of a drug to an individual will result in drug response, preferably drug toxicity, or drug
efficacy.

The methods described above for identifying a gene associated with prostate cancer and biallelic markers
indicative of a risk of suffering from asthma may be utilized to identify genes associated with other detectable
phenotypes. In particular, the above methods may be used with any marker or combination of markers included in the
maps of the present invention, including the 653 biallelic markers obtained above (which include the sequences of SEQ
ID Nos. 1-50 and 51-100 or the sequences complementary thereto), the PG1 markers, the asthma-associated markers,
and the Apo E markers of SEQ ID Nos. 301-305/307-311 or the sequences complementary thereto. As described above,
the general strategy to perform the association studies using the maps and markers of the present invention is to scan
two groups of individuals (trait positive individuals and trait negative controls) characterized by a well defined phenotype
in order to measure the allele frequencies of the biallelic markers in each of these groups. Preferably, the frequencies of
markers with inter-marker spacing of about 150 kb are determined in each group. More preferably, the frequencies of
markers with inter-marker spacing of about 75 kb are determined in each group. Even more preferably, markers with
inter-marker spacing of about 50 kb, about 37.5kb, about 30kb, or about 25kb will be tested in each population. For
genome-wide studies, it will be preferred to measure the frequencies of about 20,000, or about 40,000 biallelic markers
in each group. In a highly preferred embodiment, the frequencies of about 60,000, about 80,000, about 100,000, or
about 120,000 biallelic markers are determined in each group. In some embodiments, haplotype analyses may be run
using groups of markers located within regions spanning less than 1kb, from 1 to 5kb, from 5 to 10kb, from 10 to 25kb,
from 25 to 50kb, from 50 to 150kb, from 150 to 250kb, from 250 to 500kb, from 500kb to 1Mb, or more than 1Mb.

Allele frequency can be measured using microsequencing techniques described herein; preferred high
throughput microsequencing procedures are further exemplified below; it will be further appreciated that any other large
scale genotyping method suitable with the intended purpose contemplated herein may also be used.

In some embodiments of the present invention a computer-based system may support the on-line coordination
between the identification of biallelic markers and the corresponding analysis of their frequency in the different groups.

It will be appreciated that it is not necessary to use a full high density biallelic marker map in order to start a
genome-wide association study. It is sufficient to generate and use a first set of about 20,000 markers (one marker per
BAC, average inter-marker spacing of about 150kb). Maps having higher densities of biallelic markers (two or more
markers per BAC, average inter-marker spacing of about 75kb or less) may then be generated by starting first on those
BACs for which a candidate association has been established at the first step.
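
The correspondence between the inter-marker spacings and the marker counts recited above can be illustrated with
the following Python sketch. The genome size of about 3,000 Mb is an assumption of this sketch; it is the value implied
by pairing about 20,000 markers with an average spacing of about 150kb, and the function name is hypothetical.

```python
GENOME_SIZE_BP = 3_000_000_000  # assumed haploid genome size (about 3,000 Mb)


def markers_needed(spacing_kb: float, genome_size_bp: int = GENOME_SIZE_BP) -> int:
    """Approximate number of evenly spaced markers required for a given average spacing."""
    if spacing_kb <= 0:
        raise ValueError("spacing must be positive")
    return round(genome_size_bp / (spacing_kb * 1000))


for spacing in (150, 75, 50, 37.5, 30, 25):
    print(f"about {spacing} kb spacing -> about {markers_needed(spacing):,} markers")
```

Under this assumption the six spacings correspond to about 20,000, 40,000, 60,000, 80,000, 100,000 and 120,000
markers respectively.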

In cases when one or more candidate regions have previously been delineated, such as cases where a particular
gene or genomic region is suspected of being associated with a trait, local excerpts of biallelic marker maps having
densities above one marker per 150kb may be exploited using BACs harboring said genomic regions, or genes, or portions
thereof. In these cases also, successive association studies may be performed using sets of biallelic markers showing
increasing densities, preferably from about one every 150 kb to about one every 75kb; more preferably, sets of markers
with inter-marker spacing below about 50kb, below about 37.5kb, below about 30kb, most preferably below about 25
kb, will be used.

Haplotype analyses may also be conducted using groups of biallelic markers within the candidate region. The
biallelic markers included in each of these groups may be located within a genomic region spanning less than 1kb, from 1
to 5kb, from 5 to 10kb, from 10 to 25kb, from 25 to 50kb, from 50 to 150kb, from 150 to 250kb, from 250 to 500kb,
from 500kb to 1Mb, or more than 1Mb. It will be appreciated that the ordered DNA fragments containing these groups of
biallelic markers need not completely cover the genomic regions of these lengths but may instead be incomplete contigs
having one or more gaps therein. As discussed in further detail below, biallelic markers may be used in association studies
and haplotype analyses regardless of the completeness of the corresponding physical contig harboring them, provided linkage
disequilibrium between the markers can be assessed.

As described above, if a positive association with a trait, such as a disease, or a drug efficacy and/or toxicity,
is identified using the biallelic markers and maps of the present invention, the maps will provide not only the
confirmation of the association, but also a shortcut towards the identification of the gene involved in the trait under
study. As described above, since the markers showing positive association to the trait are in linkage disequilibrium with
the trait loci, the causal gene will be physically located in the vicinity of these markers. Regions identified through
association studies using high density maps will on average have a 20 - 40 times shorter length than those identified by
linkage analysis (2 to 20 Mb).
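
To make the scale of that reduction concrete, the sketch below computes the outer bounds implied by the figures above. It is an illustration only: treating the 20 to 40 fold factor and the 2 to 20 Mb range as independent bounds is an assumption, and the function name is hypothetical.

```python
def association_region_bounds_kb(linkage_min_mb, linkage_max_mb, fold_min, fold_max):
    """Return (smallest, largest) region length in kb after the fold reduction."""
    smallest = linkage_min_mb * 1000 / fold_max  # shortest region, largest reduction
    largest = linkage_max_mb * 1000 / fold_min   # longest region, smallest reduction
    return smallest, largest


# Figures taken from the paragraph above: 2 to 20 Mb, 20 to 40 times shorter.
low_kb, high_kb = association_region_bounds_kb(2, 20, 20, 40)
print(low_kb, high_kb)
```

Under these assumptions, the regions identified by association studies would span roughly 50 kb to 1 Mb.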

As described above, once a positive association is confirmed with the high density biallelic marker maps of the
present invention, BACs from which the most highly associated markers were derived are completely sequenced and the
mutations in the causal gene are searched by applying genomic analysis tools. As described above, once a region
harboring a gene associated with a detectable trait has been sequenced and analyzed, the candidate functional regions
(e.g. exons and splice sites, promoters and other regulatory regions) are scanned for mutations by comparing the
sequences of a selected number of controls and cases, using adequate software.

In some embodiments, trait positive samples being compared to identify causal mutations are selected among
those carrying the ancestral haplotype; in these embodiments, control samples are chosen from individuals not carrying
said ancestral haplotype.

In further embodiments, trait positive samples being compared to identify causal mutations are selected among
those showing haplotypes that are as close as possible to the ancestral haplotype; in these embodiments, control
samples are chosen from individuals not carrying any of the haplotypes selected for the case population.

The mutation detection procedure is essentially similar to that used for biallelic site identification. A pair of
oligonucleotide primers is designed in order to amplify the sequences to be tested. In preferred embodiments, priority is
given to the testing of functional sequences; in such embodiments, sequences covering every exon/promoter predicted
region, preferably including potential splice sites, are determined and compared between the T+ and T- populations.
Amplification is carried out on DNA samples from T+ and T- individuals using the polymerase chain reaction under the
above described conditions. To be sequenced, amplification products from genomic PCR may be subjected to automated
dideoxy terminator sequencing reactions and electrophoresed on ABI 377 sequencers. Following gel image analysis and
DNA sequence extraction, ABI sequence data are automatically analyzed to detect the presence of sequence variations
among T+ and T- individuals. Sequences are preferably verified by comparing the sequences of both DNA strands of
each individual.

It is preferred that candidate polymorphisms then be verified by screening a larger population of cases and
controls by means of any genotyping procedure such as those described herein, preferably using a microsequencing
technique in an individual test format. Polymorphisms are considered as candidate mutations when present in cases and
controls at frequencies compatible with the expected association results.

The maps and biallelic markers of the present invention may also be used to identify patterns of biallelic
markers associated with detectable traits resulting from polygenic interactions. The analysis of genetic interaction
between alleles at unlinked loci requires individual genotyping using the techniques described herein. The analysis of
allelic interaction among a selected set of biallelic markers with appropriate p-values can be considered as a haplotype
analysis, similar to those described in further detail within the present invention.


Use of Biallelic Markers to Identify Individuals Likely to Exhibit a Detectable
Trait Associated with a Particular Allele of a Known Gene
In addition to their utility in searches for genes associated with detectable traits on a genome-wide,
chromosome-wide, or subchromosomal level, the maps and biallelic markers of the present invention may be used in more
targeted approaches for identifying individuals likely to exhibit a particular detectable trait or individuals who exhibit
a particular detectable trait as a consequence of possessing a particular allele of a gene associated with the detectable
trait. For example, the biallelic markers and maps of the present invention may be used to identify individuals who carry
an allele of a known gene that is suspected of being associated with a particular detectable trait. In particular, the
target genes may be genes having alleles which predispose an individual to suffer from a specific disease state. In other
cases, the target genes may be genes having alleles that predispose an individual to exhibit a desired or undesired
response to a drug or other pharmaceutical composition, a food, or any administered compound. The known gene may encode
any of a variety of types of biomolecules. For example, the known genes targeted in such analyses may be genes known to
be involved in a particular step in a metabolic pathway in which disruptions may cause a detectable trait. Alternatively,
the target genes may be genes encoding receptors or ligands which bind to receptors in which disruptions may cause a
detectable trait, genes encoding transporters, genes encoding proteins with signaling activities, genes encoding proteins
involved in the immune response, genes encoding proteins involved in hematopoiesis, or genes encoding proteins involved
in wound healing. It will be appreciated that the target genes are not limited to those specifically enumerated above,
but may be any gene known to be or suspected of being associated with a detectable trait.

As previously mentioned, the maps and markers of the present invention may be used to identify genes
associated with drug response. Accordingly, the present invention comprises a method of using a drug comprising
obtaining a nucleic acid sample from an individual, determining the identity of the polymorphic base of one or more
biallelic markers obtained by the methods described above which is or are associated with a positive response to
treatment with the drug or one or more biallelic markers obtained by the methods described above which is or are
associated with a negative response to treatment with the drug, and administering the drug to the individual if the
nucleic acid sample contains one or more alleles of biallelic markers associated with a positive response to treatment
with the drug or if said nucleic acid sample lacks one or more alleles of biallelic markers associated with a negative
response to the drug. In some embodiments of the method, the administering step comprises administering the drug to
the individual if the nucleic acid sample contains one or more alleles of biallelic markers associated with a positive
response to treatment with the drug and the nucleic acid sample lacks one or more alleles of biallelic markers associated
with a negative response to the drug.
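
The difference between the two administering criteria described above (positive alleles present "or" negative alleles lacking, versus both conditions together) can be sketched as a simple decision function. This is a hedged illustration, not the claimed method: the function name, the counting of alleles, and the reading of "lacks one or more alleles" as "no negative-response allele was found" are all assumptions.

```python
def should_administer(positive_alleles_found: int,
                      negative_alleles_found: int,
                      require_both: bool = False) -> bool:
    """Decide whether the drug would be administered under the two embodiments.

    positive_alleles_found: count of alleles associated with a positive response.
    negative_alleles_found: count of alleles associated with a negative response.
    require_both: False models the broader "or" criterion, True the stricter "and".
    """
    has_positive = positive_alleles_found > 0
    # Assumption: "lacks" is simplified to "none of the negative alleles detected".
    lacks_negative = negative_alleles_found == 0
    if require_both:
        return has_positive and lacks_negative
    return has_positive or lacks_negative


# An individual carrying both kinds of alleles is treated differently by the two criteria.
print(should_administer(1, 2))                     # broader criterion
print(should_administer(1, 2, require_both=True))  # stricter criterion
```

The stricter criterion excludes individuals who carry alleles of both kinds, which the broader criterion would still admit.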

The biallelic markers of the present invention may also be used to select individuals for inclusion in
the clinical trials of a drug. By selecting individuals who are likely to respond favorably to a drug for inclusion in the
trial, the effectiveness of the drug can be assessed without lowering the measured effectiveness as a result of including
non-responders or negative responders in the clinical trial. Perhaps more importantly, using such selection may avoid
including patients who may suffer from undesirable side effects if administered the drug under trial, thus increasing the
safety of clinical trials. Accordingly, the present invention also includes a method of selecting an individual for inclusion
in a clinical trial of a drug comprising obtaining a nucleic acid sample from an individual, determining the identity of the
polymorphic base of one or more biallelic markers obtained by the methods described above which is or are associated
with a positive response to treatment with the drug or one or more biallelic markers associated with a negative response
to treatment with the drug in the nucleic acid sample, and including the individual in the clinical trial if the nucleic acid
sample contains one or more alleles of biallelic markers obtained by the methods described above which is or are
associated with a positive response to treatment with said drug or if the nucleic acid sample lacks one or more alleles of
biallelic markers associated with a negative response to the drug. In one embodiment of the method, the inclusion step
comprises including the individual in the clinical trial if the nucleic acid sample contains one or more alleles of biallelic
markers associated with a positive response to treatment with the drug and the nucleic acid sample lacks one or more
alleles of biallelic markers associated with a negative response to the drug.

In particular embodiments, one or several of the ApoE linked markers of SEQ ID Nos. 301-305/307-311 or the
sequences complementary thereto may be used in targeted approaches to identify individuals who are likely to develop
Alzheimer's disease, or to identify individuals who do suffer from Alzheimer's disease. In other embodiments, one or more of
the markers of SEQ ID Nos. 306 and 312 and one or more of the ApoE linked markers of SEQ ID Nos. 301-305/307-311
or the sequences complementary thereto are genotyped in approaches to identify individuals who are likely to develop
Alzheimer's disease, or to identify individuals who do suffer from Alzheimer's disease. In further embodiments, one or several
of the PG1 linked markers may be tested in targeted approaches to identify individuals who are likely to develop prostate
cancer, or to identify individuals who do suffer from prostate cancer. Finally, individuals likely to be asthmatic, or asthmatic
individuals, can be identified using one or more of the asthma-associated markers to conduct the procedures of the present
invention.

Given the high number of cancer types in which the PG1 chromosomal region is involved, it will be appreciated that
the PG1 markers may be employed to identify individuals at risk of developing cancers other than prostate cancer, or to
identify individuals suffering from cancers other than prostate cancer. It will be further appreciated that the asthma
associated markers may be tested to identify individuals likely to exhibit, or exhibiting, inflammatory traits other than the
asthmatic state (e.g. arthritis, or psoriasis, among others). The present invention provides adequate methods to establish
associations between markers, such as those mentioned above, and candidate traits expressly contemplated herein, thus
legitimating the corresponding targeted approaches to identify individuals likely to exhibit, or exhibiting, said candidate traits.

In some embodiments, the 653 biallelic markers obtained above (which include the sequences of SEQ ID Nos.
1-50 and 51-100 or the sequences complementary thereto) may be used in targeted approaches to identify individuals at
risk of developing a detectable trait, for example a complex disease or desired/undesired drug response, or to identify
individuals exhibiting said trait. The present invention provides methods to establish putative associations between any of
the biallelic markers described herein and any detectable traits, including those specifically described herein.

To use the maps and markers of the present invention in further targeted approaches, biallelic markers which are
in linkage disequilibrium with any of the above disclosed markers may be identified. In cases where one or more biallelic
markers of the present invention have been shown to be associated with a detectable trait, more biallelic markers in linkage
disequilibrium with said associated biallelic markers may be generated and used to perform targeted approaches aiming at
identifying individuals exhibiting, or likely to exhibit, said detectable trait according to the methods provided herein.

Furthermore, in cases where a candidate gene is suspected of being associated with a particular detectable trait or
suspected of causing the detectable trait, biallelic markers in linkage disequilibrium with said candidate gene may be
identified and used in targeted approaches, such as the approaches utilized above for the asthma-associated gene and the
Apo E gene.

Biallelic markers that are in linkage disequilibrium with markers associated with a detectable trait or with genes
associated with a detectable trait, or suspected of being so, are identified by performing single marker analyses, haplotype
association analyses, or linkage disequilibrium measurements on samples from trait positive and trait negative individuals as
described above using biallelic markers lying in the vicinity of the target marker or gene. In this manner, a single biallelic
marker or a group of biallelic markers may be identified which indicate that an individual is likely to possess the detectable
trait or does possess the detectable trait as a consequence of a particular allele of the target marker or gene.

Nucleic acid samples from individuals to be tested for predisposition to a detectable trait or possession of a
detectable trait as a consequence of a particular allele of the target gene may be examined using the diagnostic methods
described below.

Diagnostic Methods

To use the maps and biallelic markers of the present invention to diagnose whether an individual is predisposed to
express a detectable trait or whether the individual expresses a detectable trait as a result of a particular mutation, one or
more biallelic markers indicative of such a predisposition or causative mutation are identified by performing association
studies and haplotype analysis on affected and non-affected individuals as described above.

The diagnostic techniques of the present invention may employ a variety of methodologies to determine
whether a test subject has a biallelic marker pattern associated with an increased risk of developing a detectable trait or
whether the individual suffers from a detectable trait as a result of a particular mutation, including methods which
enable the analysis of individual chromosomes for haplotyping, such as family studies, single sperm DNA analysis or
somatic hybrids.

The trait analyzed using the present diagnostics may be any detectable trait, including diseases, drug response,
drug efficacy, or drug toxicity. A 'positive' drug response may refer to a response indicating either some drug efficacy
or no drug toxicity. Diagnostics which analyze drug response, drug efficacy, or drug toxicity may be used to determine
whether an individual should be treated with a particular drug. For example, if the diagnostic indicates a likelihood that
an individual will respond positively to treatment with a particular drug, the drug may be administered to the individual.
Conversely, if the diagnostic indicates that an individual is likely to respond negatively to treatment with a particular
drug, an alternative course of treatment may be prescribed. A negative response may be defined as either the absence
of an efficacious response or the presence of toxic side effects.

Clinical drug trials represent another application for the maps and markers of the present invention. One or
more markers indicative of drug response, drug efficacy, or drug toxicity may be identified using the techniques
described above. Thereafter, potential participants in clinical trials of the drug may be screened to identify those
individuals most likely to respond favorably to the drug and exclude those likely to experience side effects. In that way,
the effectiveness of drug treatment may be measured in individuals who respond positively to the drug, without lowering
the measurement as a result of the inclusion of individuals who are unlikely to respond positively in the study and
without risking undesirable safety problems.

In each of the diagnostic methods, a nucleic acid sample is obtained from the test subject and the biallelic
marker pattern is determined for one or more of the biallelic markers included in the maps of the present invention,
including the 653 biallelic markers obtained above (which include the sequences of SEQ ID Nos. 1-50 and 51-100 or the
sequences complementary thereto), the asthma-associated biallelic markers, the PG1 biallelic markers, and the Apo E
biallelic markers, including those of SEQ ID Nos. 301-305/307-311 or the sequences complementary thereto. In other
embodiments, the biallelic marker pattern of one or more of the markers of SEQ ID Nos. 306 and 312 is determined in
addition to determining the biallelic marker pattern of one or more of the biallelic markers included in the maps of the
present invention, including the 653 biallelic markers obtained above (which include the sequences of SEQ ID Nos. 1-50
and 51-100 or the sequences complementary thereto), the asthma-associated biallelic markers, the PG1 biallelic
markers, and the Apo E biallelic markers, including those of SEQ ID Nos. 301-305/307-311 or the sequences
complementary thereto. In some embodiments, the biallelic marker pattern is determined by conducting an amplification
reaction to generate amplicons containing the polymorphic bases of the one or more biallelic markers to be genotyped.
The identities of the polymorphic bases of the one or more biallelic markers to be analyzed may be determined using a
variety of methods, including hybridization assays which specifically detect amplification products containing particular
alleles of the one or more biallelic markers, and microsequencing reactions which identify the polymorphic bases of the
one or more biallelic markers to be analyzed.

While the following discussion utilizes the 653 biallelic markers obtained above (which include the sequences
of SEQ ID Nos. 1-50 and 51-100 or the sequences complementary thereto), the asthma-associated biallelic markers, the
PG1 biallelic markers, and the Apo E biallelic markers as examples of the diagnostics of the present invention, it will be
appreciated that the same diagnostics may be used in conjunction with any marker or any group of markers included in
the maps of the present invention.

Examples of amplification primers enabling the amplification, from subjects' genomic DNA samples, of DNA
fragments that carry each of the markers of SEQ ID Nos: 1-50 and 51-100 or the sequences complementary thereto, are
oligonucleotides of SEQ ID Nos: 101-150 and 151-200; pairs of corresponding primers for a given biallelic marker may
be reconstituted by choosing the adequate upstream oligonucleotide from SEQ ID Nos: 101-150 together with the
corresponding downstream oligonucleotide from SEQ ID Nos: 151-200.

SEQ ID Nos: 1-50 correspond to the sequence identification number for a first allele of the biallelic markers of
SEQ ID Nos: 1-50 and 51-100 and SEQ ID Nos: 51-100 correspond to the sequence identification number for a second
allele of the biallelic markers of SEQ ID Nos: 1-50 and 51-100.
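
The SEQ ID numbering described above lends itself to a simple lookup. The sketch below is illustrative only: it assumes that "corresponding" means the order is preserved, so that the n-th marker (first allele SEQ ID n, second allele SEQ ID n+50) pairs with upstream primer SEQ ID 100+n and downstream primer SEQ ID 150+n. That offset rule and the function name are assumptions; the sequence listing itself remains the authority.

```python
from typing import Tuple


def primer_pair_for_marker(marker_seq_id: int) -> Tuple[int, int]:
    """Return (upstream, downstream) primer SEQ ID Nos for a marker SEQ ID No.

    Assumes order-preserving correspondence between markers and primers.
    """
    if not 1 <= marker_seq_id <= 100:
        raise ValueError("marker SEQ ID No. must be between 1 and 100")
    # SEQ ID Nos. 1-50 and 51-100 are the two alleles of the same 50 markers.
    marker_index = (marker_seq_id - 1) % 50 + 1
    return 100 + marker_index, 150 + marker_index


# Both alleles of the first marker map to the same hypothetical primer pair.
print(primer_pair_for_marker(1), primer_pair_for_marker(51))
```

Because both alleles of a marker lie in the same amplified fragment, a single primer pair serves both allele sequences under this assumption.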

SEQ ID Nos: 313-318 correspond to sequence identification numbers of upstream amplification primers
that may be used to generate amplification products containing the polymorphic bases of the biallelic markers of
respective SEQ ID Nos: 301-306/307-312. SEQ ID Nos: 319-324 correspond to downstream amplification primers that
may be used to generate amplification products containing the polymorphic bases of the biallelic markers of respective
SEQ ID Nos: 301-305/307-312.

For all markers of SEQ ID Nos: 1-50/51-100 and 301-306/307-312 or the sequences complementary thereto,
the enclosed listings indicate the position and identity of the polymorphic base in each biallelic marker. Potential
microsequencing primers are also included in the sequence listing. The sequences of SEQ ID Nos. 201-250 may be used
in microsequencing procedures such as those described herein to determine the sequence of the polymorphic bases of the
biallelic markers of SEQ ID Nos. 1-50/51-100. The sequences of SEQ ID Nos. 325-330 or 331-336 may be used in
microsequencing procedures such as those described herein to determine the sequence of the polymorphic bases of the
biallelic markers of SEQ ID Nos. 301-306/307-312.

All listings indicate the internal identification number corresponding to the biallelic marker to which the listed sequence
is related.

One aspect of the present invention is a method for determining whether an individual is at risk of developing
Alzheimer's Disease or whether an individual suffers from Alzheimer's Disease as a consequence of possessing the Apo E
ε4 site A allele. The method involves obtaining a nucleic acid sample from the individual and determining whether the
nucleic acid sample contains one or more markers indicative of a risk of developing Alzheimer's Disease or one or more
markers indicative that the individual suffers from Alzheimer's Disease as a result of possessing the Apo E ε4 site A
allele. In one embodiment, the method comprises determining the identity of the polymorphic base of one or more
biallelic markers selected from the group consisting of SEQ ID Nos. 301-305/307-312 or the sequences complementary
thereto in the nucleic acid sample. In a further embodiment, the method involves determining whether the nucleic acid
sample contains the sequence of SEQ ID No. 306 (the C allele of marker 99-2452/54 containing the Apo E ε4 site A
allele) or the sequence complementary thereto. In a further embodiment, the method comprises determining whether the
nucleic acid sample contains SEQ ID No. 311 (the T allele of marker 99-385/344) or the sequence complementary
thereto. In another embodiment, the method comprises determining whether the nucleic acid sample contains SEQ ID
No. 311 (the T allele of marker 99-385/344) and SEQ ID No. 306 (the C allele of marker 99-2452/54 containing the Apo
E ε4 site A allele) or the sequence complementary thereto.

In still a further embodiment, the method comprises determining whether the nucleic acid sample contains SEQ
ID Nos. 302, 301, 303, and 304 or the sequences complementary thereto. In still a further embodiment, the method
comprises determining whether the nucleic acid sample contains SEQ ID Nos. 302, 303, and 304 or the sequences
complementary thereto. In a further embodiment, the method comprises determining whether the nucleic acid sample
contains SEQ ID No. 311 (the T allele of marker 99-385/344) or the sequence complementary thereto.

In some embodiments, the step of determining the identity of the polymorphic base of one or more biallelic
markers selected from the group consisting of SEQ ID Nos. 301-305 and SEQ ID Nos. 307-311 or the sequences
complementary thereto in the nucleic acid sample comprises conducting an amplification reaction on said nucleic acid
sample using one or more of the amplification primers selected from the group consisting of SEQ ID Nos. 313-317 and
SEQ ID Nos. 319-323 and determining the identity of the polymorphic base in said one or more biallelic markers.

In some embodiments, the identity of the polymorphic base may be determined using one or more of the
microsequencing primers listed as SEQ ID Nos. 325-329 or 331-335. In embodiments comprising the step of
determining whether the nucleic acid sample contains the sequence of SEQ ID No. 306, the method may comprise
conducting an amplification reaction on the nucleic acid sample using the pair of amplification primers consisting of SEQ
ID Nos. 318 and 324. In some embodiments, the step of determining whether the nucleic acid sample contains the
sequence of SEQ ID No. 306 comprises conducting a microsequencing reaction using one of the microsequencing primers
listed as SEQ ID Nos. 330 or 336.

Another aspect of the present invention relates to a method of determining whether an individual is at risk of
developing a trait or whether an individual expresses a trait as a consequence of possessing a particular trait-causing
allele. Alternatively, another aspect of the present invention relates to a method of determining whether an individual is
at risk of developing a plurality of traits or whether an individual expresses a plurality of traits as a result of possessing
particular trait-causing alleles. Those methods involve obtaining a nucleic acid sample from the individual and
determining whether the nucleic acid sample contains one or more markers indicative of a risk of developing the trait or
one or more markers indicative that the individual expresses the trait as a result of possessing a particular trait-causing
allele. In one embodiment, the methods comprise determining the identity of the polymorphic base of one or more
biallelic markers in the maps of the present invention, including any of the 653 biallelic markers obtained above (which
include the sequences of SEQ ID Nos. 1-50 and 51-100 or the sequences complementary thereto), the asthma-associated
biallelic markers, the PG1 biallelic markers, and the new Apo E biallelic markers. In a further embodiment, the methods
comprise determining the identities of the polymorphic bases of at least two, at least three, at least five, at least eight,
at least 20, at least 100, at least 200, at least 300, at least 400, between 400 and 2,000, between 2,000 and 4,000,
between 4,000 and 10,000, between 10,000 and 20,000 or more than 20,000 of the biallelic markers in the maps of
the present invention, including any of the 653 biallelic markers obtained above (which include the sequences of SEQ ID
Nos. 1-50 and 51-100 or the sequences complementary thereto), the asthma-associated biallelic markers, the PG1
biallelic markers, and the new Apo E biallelic markers.

In some embodiments, the step of determining the identity of the polymorphic base of one or more biallelic
markers in the maps of the present invention, including any of the 653 biallelic markers obtained above (which include
the sequences of SEQ ID Nos. 1-50 and 51-100 or the sequences complementary thereto), the asthma-associated
biallelic markers, the PG1 biallelic markers, and the new Apo E biallelic markers, comprises conducting an amplification
reaction on said nucleic acid sample using appropriate amplification primers and determining the identity of the
polymorphic base in said one or more biallelic markers. In some embodiments, the identity of the polymorphic base may
be determined using appropriate microsequencing primers.

As described herein, the diagnostics may be based on a single biallelic marker or a group of biallelic markers.
Without wishing to be limited to any particular value, it is preferred that the biallelic marker used in single marker
diagnostics, either as a positive basis for further diagnostic tests or as a preliminary starting point for early preventive
therapy, exhibit a p value in preliminary screening association analyses of about 1x10*^ or less. More preferably the p
value is about 1 x 10** or less.

Similarly, without wishing to be limited to any particular value for diagnostics based on more than one biallelic
marker, it is preferred that the haplotype exhibit a p value of 1x10^ or less, still more preferably 1 x 10^ or less and
most preferably of about 1 x 10'* or less in a preliminary screening haplotype analysis. These values are believed to be
applicable to any association studies involving single or multiple marker combinations. Significance thresholds may be
refined according to the methods previously described.

Example 32 describes methods for determining the biallelic marker pattern in a nucleic acid sample.

Example 32

A nucleic acid sample is obtained from an individual to be tested for susceptibility to a detectable trait or for a
detectable trait caused by a particular mutation. The nucleic acid sample may be an RNA sample or a DNA sample.

A PCR amplification is conducted using primer pairs which generate amplification products containing the
polymorphic nucleotides of one or more biallelic markers associated with such a predisposition or causative mutation.
For example, the amplification products may contain the polymorphic bases of one or more of the biallelic markers in the
maps of the present invention, including any of the 653 biallelic markers obtained above (which include the sequences of
SEQ ID Nos. 1-50 and 51-100 or the sequences complementary thereto), the asthma-associated biallelic markers, the
PG1 biallelic markers, and the Apo E biallelic markers, or biallelic markers in linkage disequilibrium with any of these
biallelic markers. In some embodiments, the PCR amplification is conducted using primer pairs which generate
amplification products containing the polymorphic nucleotides of several biallelic markers. For example, in one
embodiment, amplification products containing the polymorphic bases of one or more biallelic markers in the maps of the
present invention, including any of the 653 biallelic markers obtained above (which include the sequences of SEQ ID
Nos. 1-50 and 51-100 or the sequences complementary thereto), the asthma-associated biallelic markers, the PG1
biallelic markers, and the Apo E biallelic markers, biallelic markers which are in linkage disequilibrium therewith or with a
causative mutation associated with a detectable phenotype may be generated. In another embodiment, amplification
products containing the polymorphic bases of five or more biallelic markers in the maps of the present invention,
including any of the 653 biallelic markers obtained above (which include the sequences of SEQ ID Nos. 1-50 and
51-100 or the sequences complementary thereto), the asthma-associated biallelic markers, the PG1 biallelic markers, and
the Apo E biallelic markers, biallelic markers which are in linkage disequilibrium therewith or with a causative mutation
associated with a detectable phenotype may be generated. In another embodiment, amplification products containing the
polymorphic bases of 20 or more biallelic markers in the maps of the present invention, including any of the 653 biallelic
markers obtained above (which include the sequences of SEQ ID Nos. 1-50 and 51-100 or the sequences complementary
thereto), the asthma-associated biallelic markers, the PG1 biallelic markers, and the Apo E biallelic markers, biallelic
markers which are in linkage disequilibrium therewith or with the causative mutation may be generated. In another
embodiment, amplification products containing the polymorphic bases of 100 or more biallelic markers in the maps of the
present invention, including any of the 653 biallelic markers obtained above (which include the sequences of SEQ ID
Nos. 1-50 and 51-100 or the sequences complementary thereto), the asthma-associated biallelic markers, the PG1
biallelic markers, and the Apo E biallelic markers, biallelic markers which are in linkage disequilibrium therewith or with a
causative mutation associated with a detectable phenotype may be generated. In another embodiment, amplification
products containing the polymorphic bases of 200 or more biallelic markers in the maps of the present invention,
including any of the 653 biallelic markers obtained above (which include the sequences of SEQ ID Nos. 1-50 and
51-100 or the sequences complementary thereto), the asthma-associated biallelic markers, the PG1 biallelic markers, and
the Apo E biallelic markers, biallelic markers which are in linkage disequilibrium therewith or with a causative mutation
associated with a detectable phenotype may be generated. In another embodiment, amplification products containing the
polymorphic bases of 300 or more biallelic markers in the maps of the present invention, including any of the 653
biallelic markers obtained above (which include the sequences of SEQ ID Nos. 1-50 and 51-100 or the sequences
complementary thereto), the asthma-associated biallelic markers, the PG1 biallelic markers, and the Apo E biallelic
markers, biallelic markers which are in linkage disequilibrium therewith or with the causative mutation may be
generated. In another embodiment, amplification products containing the polymorphic bases of 400 or more biallelic
markers in the maps of the present invention, including any of the 653 biallelic markers obtained above (which
include the sequences of SEQ ID Nos. 1-50 and 51-100 or the sequences complementary thereto), the asthma-associated
biallelic markers, the PG1 biallelic markers, and the Apo E biallelic markers, biallelic markers which are in linkage
disequilibrium therewith or with a causative mutation associated with a detectable phenotype may be generated.

The primers used to generate the amplification products may be designed as described herein. Representative
amplification primers for generating amplification products containing the polymorphic bases of the biallelic markers of
SEQ ID Nos. 1-50 and 51-100 are provided as SEQ ID Nos. 101-150/151-200 in the accompanying Sequence Listing.
The PCR primers may be oligonucleotides of 10, 15, 20 or more bases in length which enable the amplification of the
polymorphic site in the markers. In some embodiments, the amplification product produced using these primers may be
at least 100 bases in length (i.e. about 50 nucleotides on each side of the polymorphic base). In other embodiments, the
amplification product produced using these primers may be at least 500 bases in length (i.e. about 250 nucleotides on
each side of the polymorphic base). In still further embodiments, the amplification product produced using these primers
may be at least 1000 bases in length (i.e. about 500 nucleotides on each side of the polymorphic base).
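
The relationship between flank size, primer position and amplicon length described above can be illustrated with a minimal sketch. It assumes a known template sequence and a 0-based index for the polymorphic base; the function names, the default flank of 50 nucleotides and the 20-base primer length are hypothetical choices for illustration, and the sketch ignores melting temperature and specificity checks that real primer design would require.

```python
# Illustrative only: names and defaults are assumptions, not from the specification.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def flanking_primers(template, snp_index, flank=50, primer_len=20):
    """Pick primers that bracket the polymorphic base.

    The amplicon covers `flank` nucleotides on each side of the
    polymorphic base, so its length is 2 * flank + 1.
    """
    start = snp_index - flank
    end = snp_index + flank + 1  # half-open interval
    if start < 0 or end > len(template):
        raise ValueError("template too short for the requested flank")
    if primer_len > flank:
        raise ValueError("primer would overlap the polymorphic base")
    amplicon = template[start:end]
    upstream = amplicon[:primer_len]                          # same strand as template
    downstream = reverse_complement(amplicon[-primer_len:])   # opposite strand
    return upstream, downstream, len(amplicon)
```

With a flank of 50 the amplicon is 101 bases long, which satisfies the "at least 100 bases" embodiment; flanks of 250 and 500 give the longer products in the same way.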

Table 9 lists the internal identification numbers of the 50 localized markers described herein and the Apo E
markers described herein, the SEQ ID Nos. for each of the two alleles of these biallelic markers, the SEQ ID Nos. of
representative upstream and downstream amplification primers which can be used to generate amplification products
including the polymorphic bases of these biallelic markers, and the SEQ ID Nos. of microsequencing primers which can be
used to determine the identities of the polymorphic bases of these markers.

Table 10 

Marker           SEQ ID Nos             SEQ ID Nos                  SEQ ID Nos
(Genset code)    First      Second      Amplification primers       Microsequencing primers
                 allele     allele      Upstream     Downstream     1          2

99-2251          10
99-2269          11
99-2647          49         99          149          199            249        299
99-2649          50         100         150          200            250        300

It will be appreciated that the primers listed in Table 9 are merely exemplary and that any other set of primers
which produce amplification products containing the polymorphic nucleotides of one or more of the biallelic markers of
SEQ ID Nos: 1-50 and 51-100 or biallelic markers in linkage disequilibrium therewith or with a causative mutation for a
detectable trait, or a combination thereof may be used in the diagnostic methods. It will also be appreciated that those
diagnostic methods may be performed with any biallelic marker or combination of biallelic markers included in the maps
of the present invention.

Following the PCR amplification, the identities of the polymorphic bases of one or more of the biallelic markers
in the nucleic acid sample are determined. The identities of the polymorphic bases may be determined using the
microsequencing procedures described in Example 13. It will be appreciated that the microsequencing primers listed as
SEQ ID NOs: 201-250 and 251-300 are merely exemplary and that any primer having a 3' end near the polymorphic
nucleotide, and preferably immediately adjacent to the polymorphic nucleotide, may be used. Similarly, it will be
appreciated that microsequencing analysis may be performed for any marker or combination of markers in the maps of
the present invention.
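
The preferred primer geometry, a 3' end immediately adjacent to the polymorphic nucleotide, can be sketched as follows. This is only an illustration under stated assumptions: the marker sequence is available as a string, the polymorphic base is addressed by a 0-based index, and the function names and the 19-base default length are hypothetical. The second function, which places a primer on the opposite strand, is likewise an assumption about how a second primer might be derived.

```python
# Illustrative sketch; names and default length are assumptions.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def forward_microseq_primer(marker_seq, snp_index, primer_len=19):
    """Primer on the given strand whose 3' end is the base just 5' of the SNP."""
    if snp_index < primer_len:
        raise ValueError("not enough upstream sequence for the primer")
    return marker_seq[snp_index - primer_len:snp_index]


def reverse_microseq_primer(marker_seq, snp_index, primer_len=19):
    """Primer on the opposite strand, also ending adjacent to the SNP."""
    end = snp_index + 1 + primer_len
    if end > len(marker_seq):
        raise ValueError("not enough downstream sequence for the primer")
    downstream = marker_seq[snp_index + 1:end]
    return downstream.translate(_COMPLEMENT)[::-1]
```

In both cases a single-base extension of the primer incorporates the nucleotide at (or complementary to) the polymorphic position.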

Alternatively, the microsequencing analysis may be performed as described in Pastinen et al., Genome
Research 7:606-614 (1997), the disclosure of which is incorporated herein by reference, and which is described in more
detail below.

Alternatively, the PCR product may be completely sequenced to determine the identities of the polymorphic
bases in the biallelic markers. In another method, the identities of the polymorphic bases in the biallelic markers are
determined by hybridizing the amplification products to microarrays containing allele specific oligonucleotides specific
for the polymorphic bases in the biallelic markers. The use of microarrays comprising allele specific oligonucleotides is
described in more detail below.

It will be appreciated that the identities of the polymorphic bases in the biallelic markers may be determined
using techniques other than those listed above, such as conventional dot blot analyses.

Nucleic acids used in the above diagnostic procedures may comprise at least 10 consecutive nucleotides,
including the polymorphic bases, of the biallelic markers in the maps of the present invention, including any of the 653
biallelic markers obtained above (which include the sequences of SEQ ID Nos. 1-50 and 51-100 or the sequences
complementary thereto), the asthma-associated biallelic markers, the PG1 biallelic markers, and the new Apo E biallelic
markers, including those of SEQ ID Nos. 301-305/307-311 or the sequences complementary thereto. Alternatively, the
nucleic acids used in the above diagnostic procedures may comprise at least 15 consecutive nucleotides, including the
polymorphic bases, of the biallelic markers in the maps of the present invention, including any of the 653 biallelic
markers obtained above (which include the sequences of SEQ ID Nos. 1-50 and 51-100 or the sequences complementary
thereto), the asthma-associated biallelic markers, the PG1 biallelic markers, and the new Apo E biallelic markers,
including those of SEQ ID Nos. 301-305/307-311 or the sequences complementary thereto. In some embodiments, the
nucleic acids used in the above diagnostic procedures may comprise at least 20 consecutive nucleotides, including the
polymorphic bases, of the biallelic markers in the maps of the present invention, including any of the 653 biallelic
markers obtained above (which include the sequences of SEQ ID Nos. 1-50 and 51-100 or the sequences complementary
thereto), the asthma-associated biallelic markers, the PG1 biallelic markers, and the new Apo E biallelic markers,
including those of SEQ ID Nos. 301-305/307-311 or the sequences complementary thereto. In still other embodiments,
the nucleic acids used in the above diagnostic procedures may comprise at least 30 consecutive nucleotides, including
the polymorphic bases, of the biallelic markers in the maps of the present invention, including any of the 653 biallelic
markers obtained above (which include the sequences of SEQ ID Nos. 1-50 and 51-100 or the sequences complementary
thereto), the asthma-associated biallelic markers, the PG1 biallelic markers, and the new Apo E biallelic markers,
including those of SEQ ID Nos. 301-305/307-311 or the sequences complementary thereto. In further embodiments, the
nucleic acids used in the above diagnostic procedures may comprise more than 30 consecutive nucleotides, including the
polymorphic bases, of the biallelic markers in the maps of the present invention, including any of the 653 biallelic
markers obtained above (which include the sequences of SEQ ID Nos. 1-50 and 51-100 or the sequences complementary
thereto), the asthma-associated biallelic markers, the PG1 biallelic markers, and the new Apo E biallelic markers,
including those of SEQ ID Nos. 301-305/307-311 or the sequences complementary thereto. In still further embodiments,
the nucleic acids used in the above diagnostic procedures may comprise the entire sequence of the biallelic markers in
the maps of the present invention, including any of the 653 biallelic markers obtained above (which include the
sequences of SEQ ID Nos. 1-50 and 51-100 or the sequences complementary thereto), the asthma-associated biallelic
markers, the PG1 biallelic markers, and the new Apo E biallelic markers, including those of SEQ ID Nos. 301-305/307-311
or the sequences complementary thereto. In some embodiments the nucleic acids used in the diagnostic procedures
are longer than the sequences of SEQ ID Nos. 1-50, 51-100, 301-305 and 307-311 because they contain nucleotides
adjacent to these sequences.
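
The requirement that a fragment comprise a given number of consecutive nucleotides "including the polymorphic bases" can be made concrete with a short sketch. It assumes the marker sequence is a string and the polymorphic base sits at a known 0-based index; the function name is hypothetical. It simply enumerates every window of the requested length that still contains the polymorphic position.

```python
# Illustrative sketch; the function name and indexing convention are assumptions.
def fragments_containing(marker_seq, snp_index, length):
    """Return all substrings of `length` consecutive bases that include snp_index."""
    first_start = max(0, snp_index - length + 1)
    last_start = min(snp_index, len(marker_seq) - length)
    return [marker_seq[s:s + length] for s in range(first_start, last_start + 1)]
```

For a marker with ample sequence on both sides of the polymorphic base, there are exactly `length` such fragments (10 for the 10-nucleotide embodiment, 15 for the 15-nucleotide embodiment, and so on).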

The diagnostics of the present invention may also employ nucleic acid arrays attached to DNA chips or any
other suitable solid support, including beads. As used herein, the term array means a one dimensional, two dimensional, or
multidimensional arrangement of a plurality of nucleic acids of sufficient length to permit specific detection of nucleic acids
capable of hybridizing thereto.

DNA chips allow the integration of micro-biochemical processes (such as DNA hybridization), systems of signal
detection (such as fluorescence) and data processing into a single system which can be used to obtain information on
polymorphisms. The solid surface of the chip is often made of silicon or glass but it can be a polymeric membrane.
Efficient access to polymorphism information is obtained through a basic structure comprising high-density arrays of
oligonucleotide probes attached to a solid support (the chip) at selected positions. The immobilization of arrays of DNA
probes on solid supports has been rendered possible by the development of a technology generally identified as "Very
Large Scale Immobilized Polymer Synthesis" (VLSIPS™) and in which, typically, probes are immobilized in a high density
array on a solid surface of a chip. Examples of VLSIPS™ technologies are provided in US Patents 5,143,854 and
5,412,087 and in PCT Publications WO 90/15070, WO 92/10092 and WO 95/11995, the disclosures of which are
incorporated herein by reference, which describe methods for forming oligonucleotide arrays through techniques such as
light-directed synthesis techniques.

In designing strategies aimed at providing arrays of nucleotides immobilized on solid supports, further
presentation strategies were developed to order and display the probe arrays on the chips in an attempt to maximize
hybridization patterns and sequence information. Examples of such presentation strategies are disclosed in PCT
Publications WO 94/12305, WO 94/11530, WO 97/20212 and WO 97/31250, the disclosures of which are incorporated
herein by reference.

Each DNA chip can contain thousands to millions of individual synthetic DNA probes arranged in a grid-like
pattern and miniaturized to the size of a dime.

The chip technology has been successfully used to detect mutations in numerous cases. For example, the
screening of mutations has been undertaken in the BRCA1 gene, in S. cerevisiae mutant strains, and in the protease
gene of HIV-1 virus (see Hacia et al., Nat. Genet. 14:441-447 (1996); Shoemaker et al., Nat. Genet. 14:450-456 (1996);
Kozal et al., Nat. Med. 2:753-759 (1996), the disclosures of which are incorporated herein by reference). At least three
companies propose chips able to detect biallelic polymorphisms: Affymetrix (GeneChip), Hyseq (HyChip and HyGnostics),
and Protogene Laboratories.

In some embodiments, the efficiency of hybridization of nucleic acids in the sample with the probes attached to
the chip may be improved by using polyacrylamide gel pads isolated from one another by hydrophobic regions in which
the DNA probes are covalently linked to an acrylamide matrix.

The polymorphic bases present in the biallelic marker or markers of the sample nucleic acids are determined as
follows. Probes which contain at least a portion of one or more of the biallelic markers of the present invention are
synthesized either in situ or by conventional synthesis and immobilized on an appropriate chip using methods known to
the skilled technician.

The nucleic acid sample which includes the candidate region to be analyzed is isolated, amplified with primers
capable of generating an amplification product containing the polymorphic bases of one or more biallelic markers, and
labeled with a reporter group. The reporter group can be a fluorescent group such as phycoerythrin. The labeled nucleic
acid is then incubated with the probes immobilized on the chip using a fluidics station. For example, Manz et al. (Adv. in
Chromatogr. 33:1-66 (1993), the disclosure of which is incorporated herein by reference) describe the fabrication of
fluidics devices and particularly microcapillary devices, in silicon and glass substrates.

After the reaction is completed, the chip is inserted into a scanner and patterns of hybridization are detected.
The hybridization data is collected as a signal emitted from the reporter groups already incorporated into the nucleic
acids generated in the amplification of the sample DNA, which is now bound to the probes attached to the chip. Probes
that perfectly match a sequence of the nucleic acid sample generally produce stronger signals than those that have
mismatches. Since the sequence and position of each probe immobilized on the chip is known, the identity of the nucleic
acid hybridized to a given probe can be determined.

For single-nucleotide polymorphism analyses, sets of four oligonucleotides are generally designed (one for each
possible base) that span each position of a portion of the candidate region found in the nucleic acid sample, differing only
in the identity of the central base. The relative intensity of hybridization to each series of probes at a particular location
allows the identification of the base corresponding to the central base of the probe. For example, to detect single
nucleotide polymorphisms such as those in the present biallelic markers, oligonucleotides having each of the two allelic
bases at their central position are affixed to the chip. The amplification products resulting from amplification of the
nucleic acids in the sample are hybridized to the chip under high stringency (at lower salt concentration and higher
temperature over shorter time periods) to facilitate specific detection of the polymorphic sequences present in the
nucleic acid sample.
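
The four-probe design and the intensity comparison described above can be sketched in a few lines. This is a simplified illustration under stated assumptions: probes are written in the same sense as the target for readability (real probes would be complementary to the labeled strand), intensities are supplied as plain numbers, and the function names and the 1.5-fold ratio threshold are hypothetical.

```python
# Illustrative sketch; names and the ratio threshold are assumptions.
def probe_set(target, center_index, half_width):
    """Build four probes that differ only in the identity of the central base."""
    if center_index < half_width or center_index + half_width >= len(target):
        raise ValueError("not enough flanking sequence for the probe")
    left = target[center_index - half_width:center_index]
    right = target[center_index + 1:center_index + 1 + half_width]
    return {base: left + base + right for base in "ACGT"}


def call_central_base(intensities, min_ratio=1.5):
    """Return the base whose probe hybridizes most strongly, or None if ambiguous."""
    ranked = sorted(intensities.items(), key=lambda item: item[1], reverse=True)
    (best_base, top), (_, second) = ranked[0], ranked[1]
    if top <= 0:
        return None  # no usable signal
    if second > 0 and top / second < min_ratio:
        return None  # two probes too close to distinguish
    return best_base
```

An ambiguous result is reported as `None` rather than forced to a call; a heterozygous sample, in which two probes give comparable signals, would need a separate two-allele comparison.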

The use of direct electric field control improves the determination of single base mutations (Nanogen). A
positive field increases the transport rate of negatively charged nucleic acids and results in a 10-fold increase of the
hybridization rates. Using this technique, single base pair mismatches are detected in less than 16 sec (see Sosnowski et
al., Proc. Natl. Acad. Sci. USA 94:1119-1123 (1997), the disclosure of which is incorporated herein by reference).

Another technique which can be used to analyze polymorphisms includes multicomponent integrated systems
which miniaturize and compartmentalize processes such as restriction enzyme digestion, PCR reactions, and capillary
electrophoresis in a single functional device. An example of such technique is disclosed in US patent 5,589,136, the
disclosure of which is incorporated herein by reference, which concerns the integration of PCR amplification and
capillary electrophoresis in chips. Integrated systems are best applied with microfluidic systems. These systems
comprise a pattern of microchannels designed onto a glass, silicon, quartz, or plastic wafer included on a microchip. The
movements of the samples are controlled by electric forces applied across different areas of the microchip to create
functional microscopic valves and pumps with no moving parts. Regulating or varying the voltage controls the liquid flow
at intersections between the micro-machined channels and changes the liquid flow rate for pumping across different
sections of the microchip.

In the case of biallelic marker analyses, the micro-chip integrates nucleic acid amplification, a microsequencing
reaction (such as the one described above), capillary electrophoresis and a detection method such as laser-induced
fluorescence detection.

In a first step, the DNA samples are amplified, preferably by PCR. Then the amplification products are
subjected to automated microsequencing reactions using ddNTPs (specific fluorescence for each ddNTP) and the
appropriate oligonucleotide microsequencing primers which hybridize just upstream of the targeted polymorphic base.
The microsequencing reactions may employ primers capable of being extended to the polymorphic bases of the biallelic
markers. Preferably, the microsequencing primers comprise a sequence terminating at the base immediately preceding
the polymorphic base of the biallelic markers. Once the extension at the 3' end is completed, the primers are separated
from the unincorporated fluorescent ddNTPs by capillary electrophoresis. The separation medium used in capillary
electrophoresis can for example be polyacrylamide, polyethyleneglycol or dextran. The incorporated ddNTPs in the
single-nucleotide primer extension products are identified by fluorescence detection. Preferably, the micro-chip can be used to
process at least 96 samples in parallel. More preferably, the micro-chip can be used to process at least 384 samples in
parallel. Preferably, the microchip is designed for use with detection procedures using four color laser induced
fluorescence detection of the ddNTPs.
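
Because each ddNTP carries its own fluorescent label, the genotype at a biallelic marker can in principle be read from the relative signals of the two allelic bases. The following sketch shows one simple way to do this. It is not the procedure of the specification: the function name, the input format (a mapping from base to fluorescence signal) and the 0.3 and 0.7 cut-offs are all assumptions chosen for illustration.

```python
# Illustrative sketch; thresholds and names are assumptions, not measured values.
def call_genotype(signal_by_base, allele1, allele2, het_low=0.3, het_high=0.7):
    """Call a biallelic genotype from per-ddNTP fluorescence signals.

    Returns a two-letter genotype such as "AA", "AG" or "GG",
    or None when neither allelic channel carries any signal.
    """
    s1 = signal_by_base.get(allele1, 0.0)
    s2 = signal_by_base.get(allele2, 0.0)
    total = s1 + s2
    if total <= 0:
        return None
    fraction = s1 / total  # share of the signal attributable to allele 1
    if fraction >= het_high:
        return allele1 + allele1
    if fraction <= het_low:
        return allele2 + allele2
    return allele1 + allele2
```

In practice the cut-offs would be calibrated against control samples of known genotype, and signals from the two non-allelic channels would be used as a background check.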
Any one or more alleles of the biallelic markers in the maps of the present invention, or fragments thereof
containing the polymorphic bases, may be fixed to a solid support, such as a microchip or other immobilizing surface. The
fragments of these nucleic acids may comprise at least 10, at least 15, at least 20, at least 25, or more than 25
consecutive nucleotides of the biallelic markers described herein. Preferably, the fragments include the polymorphic bases of
the biallelic markers.

A nucleic acid sample is applied to the immobilizing surface and analyzed to determine the identities of the
polymorphic bases of one or more of the biallelic markers. In some embodiments, the solid support may also include one or
more of the amplification primers described herein, or fragments comprising at least 10, at least 15, or at least 20
consecutive nucleotides thereof, for generating an amplification product containing the polymorphic bases of the biallelic
markers to be analyzed in the sample.

Another embodiment of the present invention is a solid support which includes one or more of the microsequencing
primers listed in the accompanying Sequence Listing, or fragments comprising at least 10, at least 15, or at least 20
consecutive nucleotides thereof and having a 3' terminus immediately upstream of the polymorphic base of the
corresponding biallelic marker for determining the identity of the polymorphic base of the one or more biallelic markers fixed
to the solid support.

For example, one embodiment of the present invention is an array of nucleic acids fixed to a solid support, such as
a microchip, bead, or other immobilizing surface, comprising one or more of the biallelic markers in the maps of the present
invention or a fragment comprising at least 10, at least 15, at least 20, at least 25, or more than 25 consecutive nucleotides
thereof including the polymorphic base. For example, the array may comprise one or more of any of the 653 biallelic
markers obtained above (which include the sequences of SEQ ID Nos. 1-50 and 51-100), the asthma-associated biallelic
markers, the PG1 biallelic markers, and the new Apo E biallelic markers (including SEQ ID Nos. 301-305/307-311) or the
sequences complementary thereto, or a fragment comprising at least 10, at least 15, at least 20, at least 25, or more than
25 consecutive nucleotides thereof including the polymorphic base. In a further embodiment, the array comprises at least
five of the biallelic markers in the maps of the present invention or a fragment comprising at least 10, at least 15, at least
20, at least 25, or more than 25 consecutive nucleotides thereof including the polymorphic base. For example, the arrays
may comprise at least five of any of the 653 biallelic markers obtained above (which include the sequences of SEQ ID
Nos. 1-50 and 51-100), the asthma-associated biallelic markers, the PG1 biallelic markers, and the new Apo E biallelic
markers (including the sequences of SEQ ID Nos. 301-305/307-311) or the sequences complementary thereto, or a
fragment comprising at least 10, at least 15, at least 20, at least 25, or more than 25 consecutive nucleotides thereof
including the polymorphic base. In a further embodiment the array comprises at least 10 of the biallelic markers in the
maps of the present invention or a fragment comprising at least 10, at least 15, at least 20, at least 25, or more than 25
consecutive nucleotides thereof including the polymorphic base. For example, the array may comprise at least 10 of any of
the 653 biallelic markers obtained above (which include the sequences of SEQ ID Nos. 1-50 and 51-100), the
asthma-associated biallelic markers, the PG1 biallelic markers, and the new Apo E biallelic markers (including the sequences of
SEQ ID Nos. 301-305/307-311) or the sequences complementary thereto, or a fragment comprising at least 10, at least 15,
at least 20, at least 25, or more than 25 consecutive nucleotides thereof including the polymorphic base. In a further
embodiment the array comprises at least 20 of the biallelic markers in the maps of the present invention or a fragment
comprising at least 15 consecutive nucleotides thereof including the polymorphic base. For example, the array may comprise
at least 20 of any of the 653 biallelic markers obtained above (which include the sequences of SEQ ID Nos. 1-50 and
51-100), the asthma-associated biallelic markers, the PG1 biallelic markers, and the new Apo E biallelic markers (including
the sequences of SEQ ID Nos. 301-305/307-311) or the sequences complementary thereto, or a fragment comprising at
least 10, at least 15, at least 20, at least 25, or more than 25 consecutive nucleotides thereof including the polymorphic
base. In a further embodiment the array comprises at least 100 of the biallelic markers in the maps of the present
invention or a fragment comprising at least 10, at least 15, at least 20, at least 25, or more than 25 consecutive nucleotides
thereof including the polymorphic base. For example, the array may comprise at least 100 of any of the 653 biallelic
markers obtained above (which include the sequences of SEQ ID Nos. 1-50 and 51-100), the asthma-associated biallelic
markers, the PG1 biallelic markers, and the new Apo E biallelic markers (including the sequences of SEQ ID Nos.
301-305/307-311) or the sequences complementary thereto, or a fragment comprising at least 10, at least 15, at least 20, at
least 25, or more than 25 consecutive nucleotides thereof including the polymorphic base. In a further embodiment the
array comprises at least 200 of the biallelic markers in the maps of the present invention or a fragment thereof comprising
at least 10, at least 15, at least 20, at least 25, or more than 25 consecutive nucleotides thereof including the polymorphic
base. For example, the array may comprise at least 200 of any of the 653 biallelic markers obtained above (which include
the sequences of SEQ ID Nos. 1-50 and 51-100), the asthma-associated biallelic markers, the PG1 biallelic markers, and
the new Apo E biallelic markers (including the sequences of SEQ ID Nos. 301-305/307-311) or the sequences
complementary thereto, or a fragment comprising at least 10, at least 15, at least 20, at least 25, or more than 25
consecutive nucleotides thereof including the polymorphic base. In a further embodiment the array comprises at least 300
of the biallelic markers in the maps of the present invention or a fragment comprising at least 10, at least 15, at least 20, at
least 25, or more than 25 consecutive nucleotides thereof including the polymorphic base. For example, the array may
comprise at least 300 of any of the 653 biallelic markers obtained above (which include the sequences of SEQ ID Nos.
1-50 and 51-100), the asthma-associated biallelic markers, the PG1 biallelic markers, and the new Apo E biallelic markers
(including the sequences of SEQ ID Nos. 301-305/307-311) or the sequences complementary thereto, or a fragment
comprising at least 10, at least 15, at least 20, at least 25, or more than 25 consecutive nucleotides thereof including the
polymorphic base. In a further embodiment the array comprises at least 400 of the biallelic markers in the maps of the
present invention or a fragment comprising at least 10, at least 15, at least 20, at least 25, or more than 25 consecutive
nucleotides thereof including the polymorphic base. For example, the array may comprise at least 400 of any of the 653
biallelic markers obtained above (which include the sequences of SEQ ID Nos. 1-50 and 51-100), the asthma-associated
biallelic markers, the PG1 biallelic markers, and the new Apo E biallelic markers (including the sequences of SEQ ID Nos.
301-305/307-311) or the sequences complementary thereto, or a fragment comprising at least 10, at least 15, at least 20,
at least 25, or more than 25 consecutive nucleotides thereof including the polymorphic base. In a further embodiment the
array comprises more than 400 of the biallelic markers in the maps of the present invention or a fragment comprising at
least 10, at least 15, at least 20, at least 25, or more than 25 consecutive nucleotides thereof including the polymorphic
base. For example, the array may comprise at least 400 of any of the 653 biallelic markers obtained above (which include
the sequences of SEQ ID Nos. 1-50 and 51-100), the asthma-associated biallelic markers, the PG1 biallelic markers, and
the new Apo E biallelic markers (including the sequences of SEQ ID Nos. 301-305/307-311) or the sequences
complementary thereto, or a fragment comprising at least 10, at least 15, at least 20, at least 25, or more than 25
consecutive nucleotides thereof including the polymorphic base. Each of the embodiments listed above may also include one
or more of the sequences of SEQ ID Nos. 306 and 312 in addition to those enumerated above.
Another embodiment of the present invention is an array comprising amplification primers for generating
amplification products containing the polymorphic bases of one or more, at least five, at least 10, at least 20, at least 100,
at least 200, at least 300, at least 400, or more than 400 of the biallelic markers in the maps of the present invention. For
example, the array may comprise amplification primers for generating amplification products containing the polymorphic
bases of one or more, at least five, at least 10, at least 20, at least 100, at least 200, at least 300, at least 400, or more
than 400 of any of the 653 biallelic markers obtained above (which include the sequences of SEQ ID Nos. 1-50 and
51-100 or the sequences complementary thereto), the asthma-associated biallelic markers, the PG1 biallelic markers, and
the new Apo E biallelic markers (including the sequences of SEQ ID Nos. 301-305/307-311 or the sequences
complementary thereto). In such arrays, the amplification primers included in the array are capable of amplifying the
biallelic marker sequences to be detected in the nucleic acid sample applied to the array (i.e. the amplification primers
correspond to the biallelic markers affixed to the array). For example, if the array is designed to detect the biallelic marker of
SEQ ID Nos. 1 and 51 it may also contain SEQ ID Nos. 101 and 151, the amplification primers capable of generating an
amplicon which includes SEQ ID Nos. 1 and 51. Thus, the arrays may include one or more of the amplification primers
of SEQ ID Nos. 101-200, 313-317, and 319-323 corresponding to the one or more biallelic markers of SEQ ID Nos. 1-50,
51-100, 301-305, and 307-311 which are included in the array. In other embodiments, the arrays may include
amplification primers capable of generating an amplification product which includes the biallelic markers SEQ ID Nos.
306 and 312 in addition to amplification primers capable of generating an amplification product containing each of the
markers enumerated above. Thus, in such embodiments, the arrays may further include the amplification primers of SEQ
ID Nos. 318 and 324.

Another embodiment of the present invention is an array which includes microsequencing primers capable of
determining the identity of the polymorphic bases of one or more, at least five, at least 10, at least 20, at least 100, at least
200, at least 300, at least 400, or more than 400 of the biallelic markers in the maps of the present invention. For
example, the array may comprise microsequencing primers capable of determining the identity of the polymorphic bases of
one or more, at least five, at least 10, at least 20, at least 100, at least 200, at least 300, at least 400, or more than 400
of the 653 biallelic markers obtained above (which include the sequences of SEQ ID Nos. 1-50 and 51-100 or the
sequences complementary thereto), the asthma-associated biallelic markers, the PG1 biallelic markers, and the new Apo
E biallelic markers (including the sequences of SEQ ID Nos. 301-305/307-311 or the sequences complementary thereto).
The sequences of representative microsequencing primers which may be included in the array are listed in the sequence
listing as SEQ ID Nos. 201-300, 325-329, and 331-335. In other embodiments, the arrays may further include
microsequencing primers for determining the identity of the polymorphic bases of one or more of the sequences of SEQ
ID Nos. 306 and 312, such as the microsequencing primers of SEQ ID Nos. 330 and 336.
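
The correspondence between a localized marker and its primers can be expressed as a simple offset rule. The sketch below is an inference, not a statement from the Sequence Listing: it assumes that the pattern visible in the two complete table rows (49/99/149/199/249/299 and 50/100/150/200/250/300) and in the SEQ ID Nos. 1, 51, 101 and 151 example holds for all 50 localized markers. The function name and dictionary keys are hypothetical.

```python
# Illustrative sketch; the offset rule is inferred from the examples in the text
# and is an assumption for markers other than those explicitly shown.
def seq_ids_for_localized_marker(n):
    """Map marker number n (1-50) to the SEQ ID Nos. assumed to belong to it."""
    if not 1 <= n <= 50:
        raise ValueError("rule applies only to the 50 localized markers")
    return {
        "first_allele": n,
        "second_allele": n + 50,
        "upstream_amplification_primer": n + 100,
        "downstream_amplification_primer": n + 150,
        "microsequencing_primer_1": n + 200,
        "microsequencing_primer_2": n + 250,
    }
```

An array designed around a chosen subset of markers could use such a lookup to collect the matching amplification and microsequencing primers; the Apo E markers (SEQ ID Nos. 301-312) follow a different numbering and are deliberately excluded here.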

Arrays containing any combination of the above nucleic acids which permits the specific detection or
identification of the polymorphic bases of the biallelic markers in the maps of the present invention, including any
combination of the 653 biallelic markers obtained above (which include the sequences of SEQ ID Nos. 1-50 and 51-100
or the sequences complementary thereto), the asthma-associated biallelic markers, the PG1 biallelic markers, and the
new Apo E biallelic markers (including the sequences of SEQ ID Nos. 301-305/307-311 or the sequences complementary
thereto) are also within the scope of the present invention. Other embodiments of the arrays include nucleic acids which
permit the specific detection or identification of the polymorphic bases of one or more of SEQ ID Nos. 306 and 312 in
addition to the nucleic acids permitting the specific detection or identification of the polymorphic bases of the biallelic
markers listed in the preceding sentence. For example, the array may comprise both the biallelic markers and
amplification primers capable of generating amplification products containing the polymorphic bases of the biallelic
markers. Alternatively, the array may comprise both amplification primers capable of generating amplification products
containing the polymorphic bases of the biallelic markers and microsequencing primers capable of determining the
identities of the polymorphic bases of these markers.

Although the above examples describe arrays comprising specific groups of biallelic markers and, in some embodiments, specific amplification primers and microsequencing primers, it will be appreciated that the present invention encompasses arrays including any biallelic marker, group of biallelic markers, amplification primer, group of amplification primers, microsequencing primer, or group of microsequencing primers described herein, as well as any combination of the preceding nucleic acids.

Alternatively, the microsequencing procedures described above may be used to determine whether an individual possesses a pattern of biallelic marker alleles associated with a detectable trait. In this approach, a PCR reaction is performed on the DNA or RNA of the individual to be tested to amplify the desired biallelic markers or portions thereof. The amplification product is hybridized to one or more oligonucleotides which are fixed to a surface and which have their 3' end one base from the position of the polymorphic bases of the biallelic markers. The oligonucleotides are extended one base using a detectably labeled dNTP and a polymerase. Incorporation of a pattern of detectably labeled bases indicative of a biallelic marker pattern associated with a detectable trait indicates that the individual suffers from a detectable trait as the result of a particular mutation or that the individual is at risk for developing the detectable trait at a subsequent time.
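
The single-base extension readout described above can be illustrated with a short Python sketch. It assumes that each fixed oligonucleotide reports the set of labeled bases incorporated at its marker; the marker names, allele bases, and function names are hypothetical and do not come from the sequence listing.

```python
from typing import Dict, FrozenSet, Set

# Hypothetical markers: the two possible bases at each biallelic site (assumed values).
MARKER_ALLELES: Dict[str, FrozenSet[str]] = {
    "marker_A": frozenset({"A", "G"}),
    "marker_B": frozenset({"C", "T"}),
}


def call_genotype(marker: str, incorporated: Set[str]) -> str:
    """Convert the labeled bases incorporated at one primer into a genotype string."""
    allowed = MARKER_ALLELES[marker]
    observed = incorporated & allowed
    # Reject an empty signal or a base that is not one of the two known alleles.
    if not observed or incorporated - allowed:
        raise ValueError(f"unexpected signal for {marker}: {sorted(incorporated)}")
    bases = sorted(observed)
    # One base seen means a homozygote; two bases seen means a heterozygote.
    return bases[0] * 2 if len(bases) == 1 else "".join(bases)


def carries_pattern(signals: Dict[str, Set[str]], pattern: Dict[str, str]) -> bool:
    """Return True if every marker in the pattern shows the trait-associated base."""
    return all(base in call_genotype(m, signals[m]) for m, base in pattern.items())
```

Here `carries_pattern` treats a heterozygote as carrying the associated allele; whether one copy is sufficient depends on the trait and is an assumption of this sketch.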

In addition to their use in diagnostic techniques such as those described above, any of the arrays described above may also be used to identify a haplotype (i.e. a set of alleles of biallelic markers) which is associated with a particular trait. As described above, in such analyses nucleic acid samples are obtained from trait positive and trait negative individuals and the alleles of biallelic markers present in each population are determined to identify a haplotype which is statistically associated with the trait. The arrays may be employed in haplotype analyses as follows. Nucleic acid samples obtained from trait positive and trait negative individuals are amplified with primers capable of generating amplification products which include the polymorphic bases of the biallelic markers. The amplification products are labeled with a reporter group and allowed to contact the biallelic marker probes which are attached to the support. As described above, the biallelic marker probes to which the labeled amplification products specifically hybridize are determined to indicate which alleles of the biallelic markers are present in the samples. The patterns of alleles of biallelic markers in the trait positive and trait negative individuals are then determined to identify a haplotype having a statistically significant association with the trait.
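
The comparison of trait positive and trait negative individuals can be sketched as a test on a 2x2 table of haplotype counts. The text does not name a particular statistic, so the Pearson chi-square test used here is an assumption, as are the function names and the 0.05 significance level.

```python
# Chi-square critical value for 1 degree of freedom at alpha = 0.05.
CRITICAL_P05_DF1 = 3.841


def chi_square_2x2(pos_with: int, pos_without: int,
                   neg_with: int, neg_without: int) -> float:
    """Pearson chi-square statistic for haplotype carriers vs non-carriers
    in trait positive (pos) and trait negative (neg) groups."""
    n = pos_with + pos_without + neg_with + neg_without
    row_pos = pos_with + pos_without
    row_neg = neg_with + neg_without
    col_with = pos_with + neg_with
    col_without = pos_without + neg_without
    if 0 in (row_pos, row_neg, col_with, col_without):
        raise ValueError("each row and column needs at least one observation")
    diff = pos_with * neg_without - pos_without * neg_with
    return n * diff * diff / (row_pos * row_neg * col_with * col_without)


def is_associated(pos_with: int, pos_without: int,
                  neg_with: int, neg_without: int) -> bool:
    """True if the haplotype frequency differs between groups at the 0.05 level."""
    return chi_square_2x2(pos_with, pos_without, neg_with, neg_without) > CRITICAL_P05_DF1
```

In practice many haplotypes are tested at once, so a stricter threshold or a multiple-testing correction would normally be applied; that step is outside this sketch.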

Alternatively, as described above, the nucleic acid samples from trait positive and trait negative individuals may be applied to an array comprising amplification primers capable of generating amplification products which include the polymorphic bases of the biallelic markers. The identities of the polymorphic bases in the amplification products are then determined using techniques such as the microsequencing procedures disclosed herein. Alternatively, amplification can be conducted in liquid phase and microsequencing may be conducted on the array.

Alternatively, both amplification and microsequencing reactions may be performed in liquid phase. In such embodiments, the labeled nucleotides incorporated in the microsequencing primers during the microsequencing reactions are detected by hybridizing the extended microsequencing primers to sequences complementary to the microsequencing primers. The sequences complementary to the microsequencing primers are immobilized on a support, such as those described above. The amplification and microsequencing reactions performed in liquid phase may be multiplexed, allowing the samples to be tested simultaneously for tens, hundreds, thousands or more biallelic markers.
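
Designing the immobilized sequences for this liquid-phase format amounts to taking the reverse complement of each microsequencing primer. The Python sketch below shows that step only; the primer names and sequences used with it are placeholders, not entries from the sequence listing.

```python
from typing import Dict

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    seq = seq.upper()
    if set(seq) - set("ACGT"):
        raise ValueError(f"non-DNA character in sequence: {seq!r}")
    return seq.translate(_COMPLEMENT)[::-1]


def capture_probes(primers: Dict[str, str]) -> Dict[str, str]:
    """Map each microsequencing primer name to the probe that would capture it."""
    return {name: reverse_complement(seq) for name, seq in primers.items()}
```

Because each capture probe is specific to one primer, the position of a labeled spot on the support identifies the marker, which is what allows the liquid-phase reactions to be multiplexed.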

Preferably, the array used in the haplotype analysis comprises one or more groups of biallelic markers known to be located in proximity to one another in the genome. For example, the biallelic markers in the groups may be derived from a single YAC insert, a single BAC insert or a BAC subclone. Alternatively, the biallelic markers in the groups may be derived from adjacent ordered clones. The biallelic markers in the groups may be located within a genomic region spanning less than 1kb, from 1 to 5kb, from 5 to 10kb, from 10 to 25kb, from 25 to 50kb, from 50 to 150kb, from 150 to 250kb, from 250 to 500kb, from 500kb to 1Mb, or more than 1Mb. In some embodiments, the biallelic markers in the groups comprise biallelic markers which have been localized to the same chromosome, subchromosomal region, or gene.

It will be appreciated that the ordered DNA containing the biallelic markers need not completely cover the genomic regions of these lengths but may instead be incomplete contigs having one or more gaps therein.

In some embodiments, the biallelic markers known to be located in proximity to one another in the genome may be located in physical proximity on the array. For example, the array may comprise one or more groups of at least 3 biallelic markers known to be located in proximity to one another in the genome. In some embodiments, the array may comprise one or more groups of at least 6 biallelic markers known to be located in proximity to one another in the genome. In other embodiments, the array may comprise one or more groups of at least 20 biallelic markers known to be located in proximity to one another in the genome.

The array may comprise one or more groups of biallelic markers known to be located on the same subchromosomal region. For example, the array could comprise two or more biallelic markers located at 21q11.2 (selected from the group consisting of SEQ ID Nos. 29, 79, 30 and 80), two or more markers located at 21q21 (selected from the group consisting of SEQ ID Nos. 1, 51, 2, 52, 3 and 53), two or more markers located at 21q21.2 (selected from the group consisting of SEQ ID Nos. 17, 67, 18, 68, 19, 69, 20, 70, 21, and 71), two or more markers located at 21q21.3-q22.13 (selected from the group consisting of SEQ ID Nos. 25, 75, 26, 76, 27, 77, 28, 78, 31, 81, 32, 82, 38, 88, 39, 89, 40, 90, 48, 98, 49, 99, 50, 100, 22, 72, 23, 73, 24, 74, 4, 54, 5, 55, 6, 56, 7, 57, 8, 58, 9, 59, 10, 60, 11, 61, 12, 62, 13, 63, 14, 64, 15, 65, 16, and 66), two or more markers located at 21q22.2 (selected from the group consisting of SEQ ID Nos. 41, 91, 42, 92, 43, 93, 44, 94, 45, 95, 46, 96, 47, and 97), and two or more markers located at 21q22.3 (selected from the group consisting of SEQ ID Nos. 33, 83, 34, 84, 35, 85, 36, 86, 37, and 87). Alternatively, the array could comprise amplification primers capable of generating an amplification product containing the polymorphic bases of two or more biallelic markers located at 21q11.2 (for example, amplification primers capable of generating an amplification product containing the polymorphic bases of two or more biallelic markers selected from the group consisting of SEQ ID Nos. 29, 79, 30 and 80), two or more markers located at 21q21 (for example, amplification primers capable of generating an amplification product containing the polymorphic bases of two or more biallelic markers selected from the group consisting of SEQ ID Nos. 1, 51, 2, 52, 3 and 53), two or more markers located at 21q21.2 (for example, amplification primers capable of generating an amplification product containing the polymorphic bases of two or more biallelic markers selected from the group consisting of SEQ ID Nos. 17, 67, 18, 68, 19, 69, 20, 70, 21, and 71), two or more markers located at 21q21.3-q22.13 (for example, amplification primers capable of generating an amplification product containing the polymorphic bases of two or more biallelic markers selected from the group consisting of SEQ ID Nos. 25, 75, 26, 76, 27, 77, 28, 78, 31, 81, 32, 82, 38, 88, 39, 89, 40, 90, 48, 98, 49, 99, 50, 100, 22, 72, 23, 73, 24, 74, 4, 54, 5, 55, 6, 56, 7, 57, 8, 58, 9, 59, 10, 60, 11, 61, 12, 62, 13, 63, 14, 64, 15, 65, 16, and 66), two or more markers located at 21q22.2 (for example, amplification primers capable of generating an amplification product containing the polymorphic bases of two or more biallelic markers selected from the group consisting of SEQ ID Nos. 41, 91, 42, 92, 43, 93, 44, 94, 45, 95, 46, 96, 47, and 97), and two or more markers located at 21q22.3 (for example, amplification primers capable of generating an amplification product containing the polymorphic bases of two or more biallelic markers selected from the group consisting of SEQ ID Nos. 33, 83, 34, 84, 35, 85, 36, 86, 37, and 87).

In some embodiments, the array may comprise one or more groups of biallelic markers derived from the same BAC insert. For example, the array could comprise two or more markers selected from the group consisting of SEQ ID Nos. 29, 79, 30, and 80 (derived from BAC 1), two or more markers selected from the group consisting of SEQ ID Nos. 1 and 51 (derived from BAC 2), two or more markers selected from the group consisting of SEQ ID Nos. 2, 52, 3, and 53 (derived from BAC 3), two or more markers selected from the group consisting of SEQ ID Nos. 17, 67, 18, 68, 19, 69, 20, 70, 21, and 71 (derived from BAC 4), two or more markers selected from the group consisting of SEQ ID Nos. 25, 75, 26, 76, 27, and 77 (derived from BAC 5), two or more markers selected from the group consisting of SEQ ID Nos. 28, 78, 31, 81, 32, and 82 (derived from BAC 6), two or more markers selected from the group consisting of SEQ ID Nos. 38, 88, 39, 89, 40, and 90 (derived from BAC 7), two or more markers selected from the group consisting of SEQ ID Nos. 48, 98, 49, 99, 50, and 100 (derived from BAC 8), two or more markers selected from the group consisting of SEQ ID Nos. 22, 72, 23, 73, 24, and 74 (derived from BAC 9), two or more markers selected from the group consisting of SEQ ID Nos. 4, 54, 5, 55, 6, 56, 7, 57, 8, 58, 9, 59, 10, and 60 (derived from BAC 10), two or more markers selected from the group consisting of SEQ ID Nos. 11, 61, 12, 62, 13, 63, 14, 64, 15, 65, 16, and 66 (derived from BAC 11), two or more markers selected from the group consisting of SEQ ID Nos. 41, 91, 42, 92, 43, 93, 44, 94, 45, 95, 46, 96, 47, and 97 (derived from BAC 12), or two or more markers selected from the group consisting of SEQ ID Nos. 33, 83, 34, 84, 35, 85, 36, 86, 37, and 87 (derived from BAC 13).
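
One way to picture the grouping by BAC insert, together with the earlier statement that markers close together in the genome may be placed close together on the array, is the following Python sketch. The BAC-to-marker assignments follow the listing above (only the lower SEQ ID No. of each listed pair is stored); the row-major grid layout and the function name are assumptions made for illustration.

```python
from typing import Dict, List, Tuple

# SEQ ID Nos. 1-50 grouped by BAC, following the listing in the text.
# Each number n appears in the text together with n + 50.
BAC_MARKERS: Dict[int, List[int]] = {
    1: [29, 30], 2: [1], 3: [2, 3], 4: [17, 18, 19, 20, 21],
    5: [25, 26, 27], 6: [28, 31, 32], 7: [38, 39, 40], 8: [48, 49, 50],
    9: [22, 23, 24], 10: [4, 5, 6, 7, 8, 9, 10],
    11: [11, 12, 13, 14, 15, 16], 12: [41, 42, 43, 44, 45, 46, 47],
    13: [33, 34, 35, 36, 37],
}


def layout(groups: Dict[int, List[int]], columns: int) -> Dict[int, Tuple[int, int]]:
    """Assign (row, column) spots so that markers from one BAC occupy consecutive spots."""
    if columns < 1:
        raise ValueError("columns must be at least 1")
    positions: Dict[int, Tuple[int, int]] = {}
    spot = 0
    for bac in sorted(groups):
        for seq_id in groups[bac]:
            positions[seq_id] = divmod(spot, columns)  # row-major placement
            spot += 1
    return positions
```

Calling `layout(BAC_MARKERS, 10)` would place the two BAC 1 markers in the first two spots of the first row, followed by the BAC 2 marker, and so on through BAC 13.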

Arrays comprising biallelic markers known to be located in proximity to one another in the genome permit haplotyping analyses to be conducted even when the chromosomal locations of the biallelic markers have not been determined. For example, using the procedures described above, the alleles of sets of biallelic markers which are present in nucleic acid samples from trait positive and trait negative individuals may be determined using a succession of arrays with each array having one or more groups of nucleic acids known to be located in proximity to one another thereon. The succession of arrays may comprise biallelic markers spanning the entire genome having any of the average intermarker distances specified above. Alternatively, the succession of arrays need not span the entire genome but may instead be derived from two or more contiguous YAC, BAC, or BAC subclone inserts. A statistical analysis is performed on the alleles of biallelic markers present in the trait positive and trait negative individuals to identify a haplotype having a statistically significant association with the trait. Once a statistically significant haplotype is identified, the genomic locations of the biallelic markers comprising the haplotype may be determined using the methods described herein. In addition, using the procedures described herein, the genomic region harboring the biallelic markers in the statistically significant haplotype may be evaluated to identify the genes associated with the trait.

Although this invention has been described in terms of certain preferred embodiments, other embodiments which will be apparent to those of ordinary skill in the art in view of the disclosure herein are also within the scope of this invention. Accordingly, the scope of the invention is intended to be defined only by reference to the appended claims.
Biallelic marker   BAC   Insert size   Average intermarker   Subchromosomal
(Genset code)            (kb)          distance (kb)         localization

99-2378            1     150           75                    21q11.2
99-2381            1     150           75                    21q11.2

99-2103            2     110           110                   21q21

99-2228            3     105           52.5                  21q21
99-2229            3     105           52.5                  21q21

99-2312            4     130           26                    21q21.2
99-2315            4     130           26                    21q21.2
99-2320            4     130           26                    21q21.2
99-2321            4     130           26                    21q21.2
99-2324            4     130           26                    21q21.2

99-2362            5     100           33.3                  21q21.3-q22.13
99-2364            5     100           33.3                  21q21.3-q22.13
99-2367            5     100           33.3                  21q21.3-q22.13

99-2371            6     135           45                    21q22.11-q22.13
99-2413            6     135           45                    21q22.11-q22.13
99-2419            6     135           45                    21q22.11-q22.13

99-2610            7     185           61.7                  21q22.11-q22.13
99-2615            7     185           61.7                  21q22.11-q22.13
99-2620            7     185           61.7                  21q22.11-q22.13

99-2645            8     250           83.3                  21q22.11-q22.13
99-2647            8     250           83.3                  21q22.11-q22.13
99-2649            8     250           83.3                  21q22.11-q22.13

99-2333            9     140           46.7                  21q22.11-q22.13
99-2341            9     140           46.7                  21q22.11-q22.13
99-2342            9     140           46.7                  21q22.11-q22.13

99-2240            10    95            13.6                  21q22.11-q22.13
99-2242            10    95            13.6                  21q22.11-q22.13
99-2244            10    95            13.6                  21q22.11-q22.13
99-2246            10    95            13.6                  21q22.11-q22.13
99-2248            10    95            13.6                  21q22.11-q22.13
99-2250            10    95            13.6                  21q22.11-q22.13
99-2251            10    95            13.6                  21q22.11-q22.13

99-2269            11    40            6.7                   21q22.11-q22.13
99-2271            11    40            6.7                   21q22.11-q22.13
99-2272            11    40            6.7                   21q22.11-q22.13
99-2273            11    40            6.7                   21q22.11-q22.13
99-2275            11    40            6.7                   21q22.11-q22.13
99-2278            11    40            6.7                   21q22.11-q22.13

99-2624            12    165           23.6                  21q22.2
99-2625            12    165           23.6                  21q22.2
99-2630            12    165           23.6                  21q22.2
99-2633            12    165           23.6                  21q22.2
99-2634            12    165           23.6                  21q22.2
99-2637            12    165           23.6                  21q22.2
99-2642            12    165           23.6                  21q22.2

99-2559            13    205           41                    21q22.3
99-2566            13    205           41                    21q22.3
99-2567            13    205           41                    21q22.3
99-2570            13    205           41                    21q22.3
99-2571            13    205           41                    21q22.3
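
The values in the average intermarker distance column of the table appear to equal the BAC insert size divided by the number of markers listed for that BAC. That reading is inferred from the numbers rather than stated in the text, and the function name below is hypothetical; the sketch simply reproduces the arithmetic in Python.

```python
def average_intermarker_distance(insert_size_kb: float, marker_count: int) -> float:
    """Insert size divided by marker count, rounded to one decimal place (kb)."""
    if marker_count < 1:
        raise ValueError("marker_count must be at least 1")
    return round(insert_size_kb / marker_count, 1)
```

For example, a 130 kb insert carrying five markers gives 26 kb, and a 95 kb insert carrying seven markers gives 13.6 kb, in line with the entries for BAC 4 and BAC 10.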
WHAT IS CLAIMED IS:

1. A method of obtaining a set of biallelic markers comprising the steps of:

obtaining a nucleic acid library comprising a plurality of genomic DNA fragments comprising the full genome or a portion thereof;

determining the order of said plurality of genomic DNA fragments in the genome;

determining the sequence of selected regions of said plurality of genomic DNA fragments; and

identifying nucleotides in said plurality of genomic DNA fragments which vary between individuals, thereby defining a set of biallelic markers.
2. The method of Claim 1, wherein said identifying step comprises identifying about 20,000 biallelic markers.

3. The method of Claim 1, wherein said identifying step comprises identifying about 40,000 biallelic markers.

4. The method of Claim 1, wherein said identifying step comprises identifying about 60,000 biallelic markers.

5. The method of Claim 1, wherein said identifying step comprises identifying about 80,000 biallelic markers.

6. The method of Claim 1, wherein said identifying step comprises identifying about 100,000 biallelic markers.

7. The method of Claim 1, wherein said identifying step comprises identifying about 120,000 biallelic markers.

8. The method of Claim 1, wherein said biallelic markers are separated from one another by an average distance of 10kb-200kb.

9. The method of Claim 1, wherein said biallelic markers are separated from one another by an average distance of 15kb-150kb.

10. The method of Claim 1, wherein said biallelic markers are separated from one another by an average distance of 20kb-100kb.

11. The method of Claim 1, wherein said biallelic markers are separated from one another by an average distance of 100kb-150kb.

12. The method of Claim 1, wherein said biallelic markers are separated from one another by an average distance of 50-100kb.

13. The method of Claim 1, wherein said biallelic markers are separated from one another by an average distance of 25kb-50kb.

14. The method of Claim 1, wherein the step of determining the sequence of selected regions of said plurality of genomic DNA fragments comprises inserting fragments of said plurality of genomic DNA fragments into a vector to generate a plurality of subclones and determining the sequence of a region of the inserts in said plurality of subclones or a subset thereof.

15. The method of Claim 14, wherein said step of determining the sequence of a region of said inserts or a subset thereof comprises determining the sequence of one or both end regions of said inserts or a subset thereof.

16. The method of Claim 14, wherein the step of determining the sequence of one or both end regions of said plurality of subclones comprises determining the sequence of about 500 bases at each end of said subclones or a subset thereof.

17. The method of Claim 1, wherein a set of about 10,000 to about 20,000 genomic DNA inserts with an average size between 100kb and 300kb are ordered.

18. The method of Claim 1, wherein a set of about 10,000 to about 30,000 genomic DNA inserts with an average size between 100kb and 150kb are ordered.

19. The method of Claim 1, wherein a set of about 15,000 to about 25,000 genomic DNA inserts with an average size between 100kb and 200kb are ordered.

20. The method of Claim 1, wherein said identifying step comprises identifying between 1 and 6 biallelic markers per genomic DNA fragment.

21. The method of Claim 1, wherein said identifying step comprises identifying an average of 3 biallelic markers per genomic DNA insert.

22. The method of Claim 1, wherein said genomic DNA fragments are in a Bacterial Artificial Chromosome.

23. The method of Claim 1, wherein said genomic DNA fragments are in a Yeast Artificial Chromosome.

24. The method of Claim 1, further comprising determining the position of said biallelic markers along the genome or a portion thereof.

25. The method of Claim 24, wherein the step of determining the position of said biallelic markers along the genome or portion thereof comprises determining the position of said biallelic markers along a chromosome.

26. The method of Claim 24, wherein the step of determining the position of said biallelic markers along the genome or portion thereof comprises determining the position of said biallelic markers along a subchromosomal region.

27. The method of Claim 1, further comprising identifying biallelic markers which are in linkage disequilibrium with one another.

28. The method of Claim 27, further comprising obtaining pluralities of biallelic markers such that each marker is in linkage disequilibrium with at least one identified marker.

29. The method of Claim 1, wherein said portion of the genome comprises at least 200 kb of contiguous genomic DNA.

30. The method of Claim 1, wherein said portion of the genome comprises at least 300 kb of contiguous genomic DNA.
31. The method of Claim 1, wherein said portion of the genome comprises at least 500 kb of contiguous genomic DNA.

32. The method of Claim 1, wherein said portion of the genome comprises at least 2 Mb of contiguous genomic DNA.

33. The method of Claim 1, wherein said portion of the genome comprises at least 5 Mb of contiguous genomic DNA.

34. The method of Claim 1, wherein said portion of the genome comprises at least 10 Mb of contiguous genomic DNA.

35. The method of Claim 1, wherein said portion of the genome comprises at least 20 Mb of contiguous genomic DNA.

36. The method of Claim 1, further comprising the step of identifying one or more groups of biallelic markers which are in proximity to one another in the genome.

37. The method of Claim 36, wherein the biallelic markers in each of these groups are located within a genomic region spanning less than 1kb.

38. The method of Claim 36, wherein the biallelic markers in each of these groups are located within a genomic region spanning from 1 to 5kb.

39. The method of Claim 36, wherein the biallelic markers in each of these groups are located within a genomic region spanning from 5 to 10kb.

40. The method of Claim 36, wherein the biallelic markers in each of these groups are located within a genomic region spanning from 10 to 25kb.

41. The method of Claim 36, wherein the biallelic markers in each of these groups are located within a genomic region spanning from 25 to 50kb.

42. The method of Claim 36, wherein the biallelic markers in each of these groups are located within a genomic region spanning from 50 to 150kb.

43. The method of Claim 36, wherein the biallelic markers in each of these groups are located within a genomic region spanning from 150 to 250kb.

44. The method of Claim 36, wherein the biallelic markers in each of these groups are located within a genomic region spanning from 250 to 500kb.

45. The method of Claim 36, wherein the biallelic markers in each of these groups are located within a genomic region spanning from 500kb to 1Mb.

46. The method of Claim 36, wherein the biallelic markers in each of these groups are located within a genomic region spanning more than 1Mb.

47. A method of obtaining a set of biallelic markers comprising the steps of:

obtaining a nucleic acid library comprising genomic DNA fragments comprising the full genome or a portion thereof;

determining the sequence of selected regions of said genomic DNA fragments;

identifying nucleotides in said genomic DNA fragments which vary between individuals, thereby defining a set of biallelic markers; and

determining the order of said biallelic markers along the genome or portion thereof.

WO 99/04038

48. A set of biallelic markers obtained by the method of Claim 1.

49. The set of biallelic markers of Claim 48, wherein the markers in said set have a known genomic position.

50. The set of biallelic markers of Claim 48, wherein they are ordered relative to one another.

51. A set of biallelic markers having a known relationship to one another and a known genomic position, said set of biallelic markers being obtained by the method of Claim 1.

52. The set of biallelic markers of Claim 48, wherein said biallelic markers have a heterozygosity rate of at least about 0.18.

53. The set of biallelic markers of Claim 48, wherein said biallelic markers have a heterozygosity rate of at least about 0.32.

54. The set of biallelic markers of Claim 48, wherein said biallelic markers have a heterozygosity rate of at least about 0.42.

55. A map comprising an ordered array of at least 20,000 biallelic markers obtained by the method of Claim 1.

56. The map of Claim 55, comprising an ordered array of at least 60,000 biallelic markers obtained by the method of Claim 1.

57. The map of Claim 55, comprising an ordered array of at least 100,000 biallelic markers obtained by the method of Claim 1.

58. The map of Claim 55, wherein said biallelic markers are distributed at an average marker density of one marker every 150kb.

59. The map of Claim 56, wherein said biallelic markers are distributed at an average marker density of one marker every 50kb.

60. The map of Claim 57, wherein said biallelic markers are distributed at an average marker density of one marker every 25kb.

61. A method of identifying one or more biallelic markers associated with a detectable trait comprising the steps of:

determining the frequencies of each allele of one or more biallelic markers obtained by the method of Claim 1 in individuals who express said detectable trait and individuals who do not express said detectable trait; and

identifying one or more alleles of said one or more biallelic markers which are statistically associated with the expression of said detectable trait.

62. The method of Claim 61, wherein said detectable trait is selected from the group consisting of disease, drug response, drug efficacy, and drug toxicity.
63. The method of Claim 61, wherein the phenotype of said individuals who express said detectable trait and the phenotype of said individuals who do not express said detectable trait are readily distinguishable from one another.

64. The method of Claim 61, wherein the individuals who express said detectable trait and the individuals who do not express said detectable trait are selected from a bimodal phenotype distribution.

65. The method of Claim 61, wherein said individuals who express said detectable trait are at one phenotypic extreme of the population and said individuals who do not express said detectable trait are at the other phenotypic extreme of the population.

66. A method of identifying a haplotype associated with a trait comprising the steps of:

obtaining nucleic acid samples from trait positive and trait negative individuals;

determining the frequencies of the alleles of each member of a group of biallelic markers obtained by the method of Claim 1 known to be located in proximity to one another in the genome in said nucleic acid samples; and

identifying a plurality of alleles of biallelic markers having a statistically significant association with said trait.

67. The method of Claim 66, wherein said detectable trait is selected from the group consisting of disease, drug response, drug efficacy, and drug toxicity.

68. The method of Claim 66, wherein the biallelic markers in each of these groups are located within a genomic region spanning less than 1kb.

69. The method of Claim 66, wherein the biallelic markers in each of these groups are located within a genomic region spanning from 1 to 5kb.

70. The method of Claim 66, wherein the biallelic markers in each of these groups are located within a genomic region spanning from 5 to 10kb.

71. The method of Claim 66, wherein the biallelic markers in each of these groups are located within a genomic region spanning from 10 to 25kb.

72. The method of Claim 66, wherein the biallelic markers in each of these groups are located within a genomic region spanning from 25 to 50kb.

73. The method of Claim 66, wherein the biallelic markers in each of these groups are located within a genomic region spanning from 50 to 150kb.

74. The method of Claim 66, wherein the biallelic markers in each of these groups are located within a genomic region spanning from 150 to 250kb.

75. The method of Claim 66, wherein the biallelic markers in each of these groups are located within a genomic region spanning from 250 to 500kb.

76. The method of Claim 66, wherein the biallelic markers in each of these groups are located within a genomic region spanning from 500kb to 1Mb.

77. The method of Claim 66, wherein the biallelic markers in each of these groups are located within a genomic region spanning more than 1Mb.
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76. A method of idcntifyino one or more biallelic markers associated with a detectable trait comprising 
tho stops ol : 

sctL'Cting a gene in which mutations result in a detectable trait of a Qono stupocted of being associated with a 
detectable trait; atid 

idontifyino one or more biallelic nuirkcrs obtained by the method of Claun 1 within the genomic region 
harboring said ijcnc which are associated with said detectable trait. 

79. The molhod ol Claim 78« whoroiti said detectable irait is selected from tho group consistiii(| ul 
disease, drug response, drug efficacy, and drug loiictty. 

fiO. The method of Claim 78. wherein said identifying stop comprises: 

dctormining the frequencies of said one or more bbllclic markers in individuals who express saiil dntnctable 
trait and individuals who do not express said dctcctablo trait: and 

identifying one or more biailolic markers whidi arc statistically assodatcd with tlie exftression of said 
detectable (rait. 

81. An array of nucleic acids fixed to a support, said nucleic acids comprising at least 8 consecutive 
nucleotides, including the polymorptuc nucleotide, of one or moro biallelic markers obtained by the method of Claim 1. 

81 The array of Claim 81, wherein said nucleic acids comprise at least 8 consecutive nucleotides, 
including the polymorphic nucleotide, of at least five biallelic markers obtained by the method of Claim 1. 

83. The array of Claim 61. wherein said nucleic acids comprise at least 8 consccutiva nucleotides, 
including the polymorphic nucleotide, of at least ten biallctic markers otitaincd by the method of Claim 1. 

84. An array of nucleic acids fixed to a suppon, said nucleic acids comprising at least 8 consecutive 
nucleotides, including the pofymorpliic nucleotide, of one or more groups of biallelic markers known to be located in 
proximity to one anothot in the genome. 

85. An array of nucleic acids fixed to a support said nucleic acids comprising amplification primers for 
generating an amplification product comprising at least 8 consecutive nucleotides, including the pQlymorphic nucleotide, 
of one or more (liailclic markers obtained by the method of Claim 1. 

86. An array of nucleic acids fixed to a support, said nucleic acids comprising amplification primers for 
generating an amplification product comprising at least 8 consecutive nucleotides, including the polymorphic nucleotide, 
of one or more groups of biallelic markers known to be located in proximity to one another in the genome. 

87. An array of nucleic acids fixed to a support, said nucleic acids comprising one or more 
microsequencing primers for determining the identity of the polymorphic base of one or more nucleic acids comprising at 
least 8 consecutive nucleotides, including the polymorphic nucleotide, of one or more biallelic markers obtained by the 
method of Claim 1. 

88. An array of nucleic acids fixed to a support, said nucleic acids comprising one or more 
microsequencing primers for determining the identity of the polymorphic bases of one or more groups of biallelic markers 
known to be located in proximity to one another in the genome. 

89. An array of nucleic acids fixed to a support, wherein said nucleic acids are complementary to one or 
more microsequencing primers for determining the identities of the polymorphic bases of one or more biallelic markers 
obtained by the method of Claim 1. 

90. The array of Claim 89, wherein said nucleic acids are complementary to at least five microsequencing 
primers for determining the identities of the polymorphic bases of at least five biallelic markers obtained by the method 
of Claim 1. 

91. The array of Claim 89, wherein said nucleic acids are complementary to at least ten microsequencing 
primers for determining the identities of the polymorphic bases of at least ten biallelic markers obtained by the method 
of Claim 1. 

92. An array of nucleic acids fixed to a support, said nucleic acids comprising one or more nucleic acids 
complementary to one or more microsequencing primers for determining the identity of the polymorphic bases of one or 
more groups of biallelic markers known to be located in proximity to one another in the genome. 

93. The array of any one of Claims 84, 86, 88, and 92, wherein the members of each of said one or more 
groups of biallelic markers are located in physical proximity to one another on said support. 

94. The array of any one of Claims 84, 86, 88, and 92, wherein said biallelic markers in each of these 
groups are located within a genomic region spanning less than 1kb. 

95. The array of any one of Claims 84, 86, 88, and 92, wherein said biallelic markers in each of these 
groups are located within a genomic region spanning from 1 to 5kb. 

96. The array of any one of Claims 84, 86, 88, and 92, wherein the biallelic markers in each of these 
groups are located within a genomic region spanning from 5 to 10kb. 

97. The array of any one of Claims 84, 86, 88, and 92, wherein the biallelic markers in each of these 
groups are located within a genomic region spanning from 10 to 25kb. 

98. The array of any one of Claims 84, 86, 88, and 92, wherein the biallelic markers in each of these 
groups are located within a genomic region spanning from 25 to 50kb. 

99. The array of any one of Claims 84, 86, 88, and 92, wherein the biallelic markers in each of these 
groups are located within a genomic region spanning from 50 to 150kb. 

100. The array of any one of Claims 84, 86, 88, and 92, wherein the biallelic markers in each of these 
groups are located within a genomic region spanning from 150 to 250kb. 

101. The array of any one of Claims 84, 86, 88, and 92, wherein the biallelic markers in each of these 
groups are located within a genomic region spanning from 250 to 500kb. 

102. The array of any one of Claims 84, 86, 88, and 92, wherein the biallelic markers in each of these 
groups are located within a genomic region spanning from 500kb to 1Mb. 

103. The array of any one of Claims 84, 86, 88, and 92, wherein the biallelic markers in each of these 
groups are located within a genomic region spanning more than 1Mb. 

104. The array of any one of Claims 84, 86, 88, and 92, wherein each group of biallelic markers 
comprises at least 3 biallelic markers. 

105. The array of any one of Claims 84, 86, 88, and 92, wherein each group of biallelic markers 
comprises at least 6 biallelic markers. 

106. The array of any one of Claims 84, 86, 88, and 92, wherein each group of biallelic markers 
comprises at least 20 biallelic markers. 

107. A method for determining whether an individual is at risk of developing a detectable trait or suffers 
from a detectable trait associated with said trait comprising the steps of: 

obtaining a nucleic acid sample from said individual; 

screening said nucleic acid sample with one or more biallelic markers obtained by the method of Claim 1; and 
determining whether said nucleic acid sample contains one or more biallelic markers statistically 
associated with said detectable trait. 

108. The method of Claim 107, wherein said detectable trait is selected from the group consisting of 
disease, drug response, drug efficacy and drug toxicity. 

109. The method of Claim 107, wherein said biallelic markers were obtained by the method of Claim 61. 

110. The method of Claim 107, wherein said biallelic markers were obtained by the method of Claim 70. 

111. A method of using a drug comprising: 

obtaining a nucleic acid sample from an individual; 

determining the identity of the polymorphic base of one or more biallelic markers obtained by the method of 
Claim 1 which is associated with a positive response to treatment with said drug or one or more biallelic markers 
obtained by the method of Claim 1 which is associated with a negative response to treatment with said drug; and 

administering said drug to said individual if said nucleic acid sample contains one or more biallelic markers 
associated with a positive response to treatment with said drug or if said nucleic acid sample lacks one or more biallelic 
markers associated with a negative response to said drug. 
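
The administering condition recited above can be read as a simple two-branch test. The following is a minimal sketch of that reading, assuming each marker allele is represented as a hypothetical (marker name, base) pair; the function name and the marker names are illustrative only and do not come from the specification.

```python
def meets_administering_condition(sample_alleles, positive_markers, negative_markers):
    """Return True when the sample satisfies either branch of the condition.

    Branch 1: the sample contains one or more alleles associated with a
    positive response. Branch 2: the sample lacks one or more alleles
    associated with a negative response (a literal reading of the wording).
    """
    has_positive = any(allele in sample_alleles for allele in positive_markers)
    lacks_negative = any(allele not in sample_alleles for allele in negative_markers)
    return has_positive or lacks_negative

# Hypothetical marker alleles, for illustration only.
positive = {("marker_1", "A")}
negative = {("marker_2", "T")}

print(meets_administering_condition({("marker_1", "A"), ("marker_2", "T")}, positive, negative))
print(meets_administering_condition({("marker_2", "T")}, positive, negative))
```

The first call satisfies the positive-response branch; the second satisfies neither branch, because the sample carries the negative-response allele and no positive-response allele.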

112. The method of Claim 111, wherein said determining step comprises determining the identity of the 
polymorphic base of one or more biallelic markers obtained by the method of Claim 62 which is associated with a 
positive response to treatment with said drug or one or more biallelic markers obtained by the method of Claim 62 which 
is associated with a negative response to treatment with said drug. 

113. The method of Claim 111, wherein said determining step comprises determining the identity of the 
polymorphic base of one or more biallelic markers obtained by the method of Claim 79 which is associated with a 
positive response to treatment with said drug or one or more biallelic markers obtained by the method of Claim 79 which 
is associated with a negative response to treatment with said drug. 

114. A method of selecting an individual for inclusion in a clinical trial of a drug comprising: 
obtaining a nucleic acid sample from an individual; 

determining the identity of the polymorphic base of one or more biallelic markers obtained by the method of 
Claim 1 which is associated with a positive response to treatment with said drug or one or more biallelic markers 
associated with a negative response to treatment with said drug in said nucleic acid sample; and 

including said individual in said clinical trial if said nucleic acid sample contains one or more biallelic markers 
obtained by the method of Claim 1 which is associated with a positive response to treatment with said drug or if said 
nucleic acid sample lacks one or more biallelic markers associated with a negative response to said drug. 

115. The method of Claim 114, wherein said determining step comprises determining the identity of the 
polymorphic base of one or more biallelic markers obtained by the method of Claim 62 which is associated with a 
positive response to treatment with said drug or one or more biallelic markers obtained by the method of Claim 62 which 
is associated with a negative response to treatment with said drug. 

116. The method of Claim 114, wherein said determining step comprises determining the identity of the 
polymorphic base of one or more biallelic markers obtained by the method of Claim 79 which is associated with a 
positive response to treatment with said drug or one or more biallelic markers obtained by the method of Claim 79 which 
is associated with a negative response to treatment with said drug. 

117. A method of identifying a gene associated with a detectable trait comprising the steps of: 
determining the frequency of each allele of one or more biallelic markers obtained by the method of Claim 1 in 
individuals having said detectable trait and individuals lacking said detectable trait; 

identifying one or more alleles of one or more biallelic markers having a statistically significant association 
with said detectable trait; and 

identifying a gene in linkage disequilibrium with said one or more alleles. 
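
Linkage disequilibrium between two biallelic markers is commonly summarized by the coefficients D, D' and r². The sketch below computes them from phased haplotype counts; it is an illustration under stated assumptions (the function name and the counts are hypothetical, and the text above does not prescribe a particular measure).

```python
def linkage_disequilibrium(n_AB, n_Ab, n_aB, n_ab):
    """Return (D, D', r^2) for two biallelic markers from haplotype counts.

    A/a are the two alleles of the first marker and B/b those of the second.
    """
    n = n_AB + n_Ab + n_aB + n_ab
    if n == 0:
        raise ValueError("no haplotypes observed")
    p_AB = n_AB / n
    p_A = (n_AB + n_Ab) / n
    p_B = (n_AB + n_aB) / n
    d = p_AB - p_A * p_B
    denom = p_A * (1 - p_A) * p_B * (1 - p_B)
    if denom == 0:
        raise ValueError("both markers must be polymorphic in the sample")
    r2 = d * d / denom
    # D' rescales D by the largest magnitude it could take given the allele frequencies.
    if d >= 0:
        d_max = min(p_A * (1 - p_B), (1 - p_A) * p_B)
    else:
        d_max = min(p_A * p_B, (1 - p_A) * (1 - p_B))
    d_prime = d / d_max if d_max > 0 else 0.0
    return d, d_prime, r2

# Hypothetical counts: only the AB and ab haplotypes are observed (complete disequilibrium).
print(linkage_disequilibrium(50, 0, 0, 50))
```

With equal counts for all four haplotypes the same function returns zero for all three coefficients, which corresponds to linkage equilibrium.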

118. The method of Claim 78, further comprising identifying a mutation in the gene which is associated 
with said detectable trait. 

119. The method of Claim 78, wherein said detectable trait is selected from the group consisting of 
disease, drug response, drug efficacy, and drug toxicity. 

120. A method of identifying a gene associated with a detectable trait comprising: 
selecting a gene suspected of being associated with a detectable trait; and 

identifying one or more biallelic markers obtained by the method of Claim 1 within the genomic region 
harboring said gene which are associated with said detectable trait. 

121. The method of Claim 120, wherein said detectable trait is selected from the group consisting of 
disease, drug response, drug efficacy, and drug toxicity. 

122. The method of Claim 120, wherein said identifying step comprises: 

determining the frequencies of said one or more biallelic markers in individuals who express said detectable 
trait and individuals who do not express said detectable trait; and 

identifying one or more biallelic markers which are statistically associated with the expression of said 
detectable trait. 

123. A method of identifying a haplotype associated with a trait comprising the steps of: 
obtaining nucleic acid samples from trait positive and trait negative individuals; 
conducting an amplification reaction on said nucleic acid samples using amplification primers capable of 
generating amplification products containing the polymorphic bases of a plurality of biallelic markers; 
contacting one or more arrays according to Claim 84 with said amplification products; 
determining the identities of the polymorphic bases of said amplification products; and 
identifying a haplotype having a statistically significant association with said trait. 

124. A method of identifying a haplotype associated with a trait comprising the steps of: 
obtaining nucleic acid samples from trait positive and trait negative individuals; 

conducting amplification reactions on said nucleic acid samples using amplification primers capable of 
generating amplification products containing the polymorphic bases of a plurality of biallelic markers; 

contacting one or more arrays according to Claim 88 with said amplification products; 

conducting microsequencing reactions on said amplification products using microsequencing primers on said 
arrays, thereby generating elongated microsequencing primers comprising the polymorphic bases of said amplification 
products; 

determining the identities of said polymorphic bases; and 

identifying a haplotype having a statistically significant association with said trait. 

125. A method of identifying a haplotype associated with a trait comprising the steps of: 
obtaining nucleic acid samples from trait positive and trait negative individuals; 

conducting amplification reactions on said nucleic acid samples using amplification primers which are capable 
of generating amplification products containing the polymorphic bases of a plurality of biallelic markers; 

conducting microsequencing reactions on said nucleic acid samples, thereby generating microsequencing 
products containing the polymorphic bases of one or more biallelic markers at their 3' ends, said polymorphic bases being 
detectably labeled; 

contacting one or more arrays according to Claim 92 with said microsequencing products such that said 
microsequencing products specifically hybridize to said nucleic acids complementary to said microsequencing primers; 

determining the identities of the polymorphic bases of said microsequencing products; and 

identifying a haplotype having a statistically significant association with said trait. 

126. A method of identifying a haplotype associated with a trait comprising the steps of: 

obtaining nucleic acid samples from trait positive and trait negative individuals; 

contacting one or more arrays according to Claim 86 with said nucleic acid sample; 

conducting an amplification reaction on said nucleic acid samples using amplification primers on said array 
which are capable of generating amplification products containing the polymorphic bases of a plurality of biallelic 
markers; 

determining the identities of the polymorphic bases of said amplification products; and 
identifying a haplotype having a statistically significant association with said trait. 

127. A method of determining whether an individual is at risk of developing Alzheimer's disease or whether 
the individual suffers from Alzheimer's disease as a result of possessing the Apo E ε4 Site A allele comprising: 
obtaining a nucleic acid sample from said individual; and 
determining the identity of the polymorphic base in one or more of the sequences selected from the group 
consisting of SEQ ID Nos. 301-305 and SEQ ID Nos. 307-311 or the sequences complementary thereto in said nucleic 
acid sample. 

128. The method of Claim 127, further comprising determining whether said nucleic acid sample contains 
the sequence of SEQ ID No. 308 or the sequence complementary thereto. 

129. The method of Claim 127, wherein said step of determining the identity of the polymorphic bases in 
one or more of the sequences selected from the group consisting of SEQ ID Nos. 301-305 and SEQ ID Nos. 307-311 or 
the sequences complementary thereto comprises determining whether said nucleic acid sample contains the sequence of 
SEQ ID No. 311 or the sequence complementary thereto. 

130. The method of Claim 129, further comprising determining whether said nucleic acid sample contains 
the sequence of SEQ ID No. 306 or the sequence complementary thereto. 

131. An isolated nucleic acid comprising a sequence selected from the group consisting of SEQ ID No. 
301, SEQ ID No. 307, the sequences complementary thereto, and fragments comprising at least 8 consecutive 
nucleotides, including the polymorphic nucleotide, thereof. 

132. An isolated nucleic acid comprising a sequence selected from the group consisting of SEQ ID No. 302, 
SEQ ID No. 306, the sequences complementary thereto, and fragments comprising at least 8 consecutive nucleotides 
thereof. 

133. An isolated nucleic acid comprising a sequence selected from the group consisting of SEQ ID No. 
303, SEQ ID No. 309, the sequences complementary thereto, and fragments comprising at least 8 consecutive 
nucleotides, including the polymorphic nucleotide, thereof. 

134. An isolated nucleic acid comprising a sequence selected from the group consisting of SEQ ID No. 
304, SEQ ID No. 310, the sequences complementary thereto, and fragments comprising at least 8 consecutive 
nucleotides, including the polymorphic nucleotide, thereof. 

135. An isolated nucleic acid comprising a sequence selected from the group consisting of SEQ ID No. 
305, SEQ ID No. 311, the sequences complementary thereto, and fragments comprising at least 8 consecutive 
nucleotides, including the polymorphic nucleotide, thereof. 

136. An isolated nucleic acid comprising a sequence selected from the group consisting of SEQ ID Nos. 
313-317, SEQ ID Nos. 319-323, and fragments comprising at least 8 consecutive nucleotides thereof. 

137. An isolated nucleic acid comprising a sequence selected from the group consisting of SEQ ID Nos. 
325-329, SEQ ID Nos. 331-335, the sequences complementary thereto, and fragments comprising at least 8 consecutive 
nucleotides thereof. 

138. A set of nucleic acids comprising at least 8 consecutive nucleotides, including the polymorphic 
nucleotide, of one or more biallelic markers obtained by the method of Claim 1. 

139. A set of nucleic acids comprising amplification primers for generating an amplification product 
comprising at least 8 consecutive nucleotides, including the polymorphic nucleotide, of one or more biallelic markers 
obtained by the method of Claim 1. 

140. A set of nucleic acids comprising one or more microsequencing primers for determining the identity of 
the polymorphic base of one or more nucleic acids comprising at least 8 consecutive nucleotides, including the 
polymorphic nucleotide, of one or more biallelic markers obtained by the method of Claim 1. 

Figure 1 

PROSTATE CANCER HAPLOTYPE SIMULATIONS (100 ITERATIONS) 

SEQUENCE LISTING 



(1) GENERAL INFORMATION: 

(i) APPLICANT: 

(A) NAME: GENSET SA 

(B) STREET: 24, RUE ROYALE 

(C) CITY: PARIS 

(E) COUNTRY: FRANCE 

(F) POSTAL CODE (ZIP): 75008 



(ii) TITLE OF INVENTION: Biallelic markers for use in constructing 
a high density disequilibrium map of the human genome. 

(iii) NUMBER OF SEQUENCES: 336 

(v) COMPUTER READABLE FORM: 

(A) MEDIUM TYPE: Floppy Disk 

(B) COMPUTER: IBM PC compatible 

(C) OPERATING SYSTEM: Win95 

(D) SOFTWARE: Word 

(2) INFORMATION FOR SEQ ID NO: 1: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 47 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: polymorphic fragment 99-2103-270 

(B) LOCATION: 1..47 

(ix) FEATURE: 

(A) NAME/KEY: polymorphic base 

(B) LOCATION: 24 

(D) OTHER INFORMATION: base g 

(ix) FEATURE: 

(A) NAME/KEY: Potential microsequencing oligo 99-2103-270 

(B) LOCATION: 1..23 

(ix) FEATURE: 

(A) NAME/KEY: Potential microsequencing oligo 99-2103-270 

(B) LOCATION: complement 25..47 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 1: 
CTTGGATTCA TATGAGACAG CTAGCAGACC TTCAATTTTT CTACACT 47 



(2) INFORMATION FOR SEQ ID NO: 2: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 47 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 
(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: polymorphic fragment 99-2228-301 

(B) LOCATION: 1..47 

(ix) FEATURE: 

(A) NAME/KEY: polymorphic base 

(B) LOCATION: 24 

(D) OTHER INFORMATION: base a 

(ix) FEATURE: 

(A) NAME/KEY: Potential microsequencing oligo 99-2228-301 

(B) LOCATION: 1..23 

(ix) FEATURE: 

(A) NAME/KEY: Potential microsequencing oligo 99-2228-301 

(B) LOCATION: complement 25..47 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 2: 



CCCTGCTTAT CCCTGTAAGG TGGAGACCCA TATGGGCAAG GCCAGAC 47 
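
Each entry in this listing annotates a 47-base fragment with the polymorphic base at position 24 and two potential microsequencing oligos, one at positions 1..23 and one on the complementary strand at positions 25..47. As a minimal sketch of how those features map onto the sequence string (the function names are hypothetical, and the reading of "complement 25..47" as the reverse complement of that span is an assumption based on the usual feature-location convention):

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(COMPLEMENT)[::-1]

def microsequencing_oligos(fragment, poly_pos=24, flank=23):
    """Split a listed fragment into (upstream oligo, polymorphic base, downstream oligo).

    poly_pos is the 1-based position of the polymorphic base; flank is the
    oligo length on each side. Spaces used for layout in the listing are ignored.
    """
    seq = fragment.replace(" ", "").upper()
    i = poly_pos - 1  # convert to a 0-based index
    if i - flank < 0 or i + 1 + flank > len(seq):
        raise ValueError("fragment is too short for the requested flanks")
    upstream = seq[i - flank:i]                                 # positions 1..23
    downstream = reverse_complement(seq[i + 1:i + 1 + flank])   # complement 25..47
    return upstream, seq[i], downstream

# Sequence text as listed for SEQ ID NO: 2.
SEQ_ID_NO_2 = "CCCTGCTTAT CCCTGTAAGG TGGAGACCCA TATGGGCAAG GCCAGAC"
upstream, base, downstream = microsequencing_oligos(SEQ_ID_NO_2)
print(upstream, base, downstream)
```

For SEQ ID NO: 2 the base returned at position 24 is A, which agrees with the "base a" annotation in the entry.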



(2) INFORMATION FOR SEQ ID NO: 3: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 47 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: polymorphic fragment 99-2229-240 

(B) LOCATION: 1..47 

(ix) FEATURE: 

(A) NAME/KEY: polymorphic base 

(B) LOCATION: 24 

(D) OTHER INFORMATION: base g 

(ix) FEATURE: 

(A) NAME/KEY: Potential microsequencing oligo 99-2229-240 

(B) LOCATION: 1..23 

(ix) FEATURE: 

(A) NAME/KEY: Potential microsequencing oligo 99-2229-240 

(B) LOCATION: complement 25..47 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 3: 



TCGTCATCGT GGCCTGGGCT ACAGACTACC TGTTCCAGTC CTTCCAG 47 



(2) INFORMATION FOR SEQ ID NO: 4: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 47 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: polymorphic fragment 99-2240-281 

(B) LOCATION: 1..47 

(ix) FEATURE: 

(A) NAME/KEY: polymorphic base 

(B) LOCATION: 24 

(D) OTHER INFORMATION: base c 

(ix) FEATURE: 

(A) NAME/KEY: Potential microsequencing oligo 99-2240-281 

(B) LOCATION: 1..23 

(ix) FEATURE: 

(A) NAME/KEY: Potential microsequencing oligo 99-2240-281 

(B) LOCATION: complement 25..47 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 4: 



GCAATCTTAA TAACTTTTTA TTTCAGTAAT TCGAATCTTT TTTTTCT 47 



(2) INFORMATION FOR SEQ ID NO: 5: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 47 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: polymorphic fragment 99-2242-206 
(B) LOCATION: 1..47 

(ix) FEATURE: 

(A) NAME/KEY: polymorphic base 

(B) LOCATION: 24 

(D) OTHER INFORMATION: base c 

(ix) FEATURE: 

(A) NAME/KEY: Potential microsequencing oligo 99-2242-206 

(B) LOCATION: 1..23 

(ix) FEATURE: 

(A) NAME/KEY: Potential microsequencing oligo 99-2242-206 

(B) LOCATION: complement 25..47 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 5: 
GTGTTTTCTT TTAGTCAAAT TATCTTATAT TTTACTTTTT TCTTAAG 47 



(2) INFORMATION FOR SEQ ID NO: 6: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 47 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: polymorphic fragment 99-2244-83 

(B) LOCATION: 1..47 



(ix) FEATURE: 

(A) NAME/KEY: polymorphic base 

(B) LOCATION: 24 

(D) OTHER INFORMATION: base a 

(ix) FEATURE: 

(A) NAME/KEY: Potential microsequencing oligo 99-2244-83 

(B) LOCATION: 1..23 

(ix) FEATURE: 

(A) NAME/KEY: Potential microsequencing oligo 99-2244-83 

(B) LOCATION: complement 25..47 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 6: 



TAATTGTAGA TACTAAGACC ATTATGCTTA AACCATGTAG GTACTGA 47 



(2) INFORMATION FOR SEQ ID NO: 7: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 47 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: polymorphic fragment 99-2246-340 
(B) LOCATION: 1..47 

(ix) FEATURE: 

(A) NAME/KEY: polymorphic base 

(B) LOCATION: 24 

(D) OTHER INFORMATION: base a 

(ix) FEATURE: 

(A) NAME/KEY: Potential microsequencing oligo 99-2246-340 

(B) LOCATION: 1..23 

(ix) FEATURE: 

(A) NAME/KEY: Potential microsequencing oligo 99-2246-340 

(B) LOCATION: complement 25..47 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 7: 



ATTTATATGT TAAATGCAGA GAAAAAGAAA AATAAGTTTT GCAGTAA 47 



(2) INFORMATION FOR SEQ ID NO: 8: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 47 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: polymorphic fragment 99-2248-76 
(B) LOCATION: 1..47 

(ix) FEATURE: 

(A) NAME/KEY: polymorphic base 

(B) LOCATION: 24 

(D) OTHER INFORMATION: base c 

(ix) FEATURE: 

(A) NAME/KEY: Potential microsequencing oligo 99-2248-76 

(B) LOCATION: 1..23 

(ix) FEATURE: 

(A) NAME/KEY: Potential microsequencing oligo 99-2248-76 
(B) LOCATION: complement 25..47 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 8: 



GACAGAGAGG GAAGGTAATC TTCCCCTGAA GTCTGCCCAT CCCCTGG 47 



(2) INFORMATION FOR SEQ ID NO: 9: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 47 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: polymorphic fragment 99-2250-236 

(B) LOCATION: 1..47 

(ix) FEATURE: 

(A) NAME/KEY: polymorphic base 

(B) LOCATION: 24 

(D) OTHER INFORMATION: base c 

(ix) FEATURE: 

(A) NAME/KEY: Potential microsequencing oligo 99-2250-236 

(B) LOCATION: 1..23 

(ix) FEATURE: 

(A) NAME/KEY: Potential microsequencing oligo 99-2250-236 

(B) LOCATION: complement 25..47 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 9: 



ATGTATCCAA AACAGAATTA ACACACTTTG GGTTTTTTAT TTTTATT 47 



(2) INFORMATION FOR SEQ ID NO: 10: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 47 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: polymorphic fragment 99-2251-151 

(B) LOCATION: 1..47 

(ix) FEATURE: 

(A) NAME/KEY: polymorphic base 

(B) LOCATION: 24 

(D) OTHER INFORMATION: base a 

(ix) FEATURE: 

(A) NAME/KEY: Potential microsequencing oligo 99-2251-151 

(B) LOCATION: 1..23 

(ix) FEATURE: 

(A) NAME/KEY: Potential microsequencing oligo 99-2251-151 

(B) LOCATION: complement 25..47 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 10: 



TGAAAAGAAG TTCAGACGAT TGCAGATAGA CTAGTTTGGC TGTTGTG 47 



(2) INFORMATION FOR SEQ ID NO: 11: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 47 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: polymorphic fragment 99-2269-179 
(B) LOCATION: 1..47 

(ix) FEATURE: 

(A) NAME/KEY: polymorphic base 

(B) LOCATION: 24 

(D) OTHER INFORMATION: base a 

(ix) FEATURE: 

(A) NAME/KEY: Potential microsequencing oligo 99-2269-179 

(B) LOCATION: 1..23 

(ix) FEATURE: 

(A) NAME/KEY: Potential microsequencing oligo 99-2269-179 

(B) LOCATION: complement 25..47 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 11: 
AAAATAAAGA AATTCCTAGA GACATACAGC CTATCAAGAT CAAACCA 47 



(2) INFORMATION FOR SEQ ID NO: 12: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 47 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: polymorphic fragment 99-2271-403 
(B) LOCATION: 1..47 

(ix) FEATURE: 

(A) NAME/KEY: polymorphic base 

(B) LOCATION: 24 

(D) OTHER INFORMATION: base a 

(ix) FEATURE: 

(A) NAME/KEY: Potential microsequencing oligo 99-2271-403 

(B) LOCATION: 1..23 

(ix) FEATURE: 

(A) NAME/KEY: Potential microsequencing oligo 99-2271-403 

(B) LOCATION: complement 25..47 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 12: 
AGGCATTTAT TTCATATTTA TTAACCTTGA TTTTCTTATC TTCAAGT 47 



(2) INFORMATION FOR SEQ ID NO: 13: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 47 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: polymorphic fragment 99-2272-409 
(B) LOCATION: 1..47 

(ix) FEATURE: 

(A) NAME/KEY: polymorphic base 

(B) LOCATION: 24 

(D) OTHER INFORMATION: base g 

(ix) FEATURE: 

(A) NAME/KEY: Potential microsequencing oligo 99-2272-409 

(B) LOCATION: 1..23 

(ix) FEATURE: 

(A) NAME/KEY: Potential microsequencing oligo 99-2272-409 

(B) LOCATION: complement 25..47 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 13: 
AAAAGCACTG CAATTATTTT GGAGACTGTG AAATATTGCA AGTTTTA 47 



(2) INFORMATION FOR SEQ ID NO: 14: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 47 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: polymorphic fragment 99-2273-528 
(B) LOCATION: 1..47 

(ix) FEATURE: 

(A) NAME/KEY: polymorphic base 

(B) LOCATION: 24 

(D) OTHER INFORMATION: base c 

(ix) FEATURE: 

(A) NAME/KEY: Potential microsequencing oligo 99-2273-528 

(B) LOCATION: 1..23 

(ix) FEATURE: 

(A) NAME/KEY: Potential microsequencing oligo 99-2273-528 

(B) LOCATION: complement 25..47 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 14: 



ACTTGAAGAT AAGAAAATCA AGGCTAATAA ATATGAAATA AATGCCT 47 



(2) INFORMATION FOR SEQ ID NO: 15: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 47 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: polymorphic fragment 99-2275-466 

(B) LOCATION: 1..47 

(ix) FEATURE: 

(A) NAME/KEY: polymorphic base 

(B) LOCATION: 24 

(D) OTHER INFORMATION: base c 

(ix) FEATURE: 

(A) NAME/KEY: Potential microsequencing oligo 99-2275-466 

(B) LOCATION: 1..23 

(ix) FEATURE: 

(A) NAME/KEY: Potential microsequencing oligo 99-2275-466 

(B) LOCATION: complement 25..47 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 15: 



TTGATGATAG CATTAAATAC TCCCAAAAAC TGTGAATAGG GATACTA 47 



(2) INFORMATION FOR SEQ ID NO: 16: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 47 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: polymorphic fragment 99-2278-276 

(B) LOCATION: 1..47 

(ix) FEATURE: 

(A) NAME/KEY: polymorphic base 

(B) LOCATION: 24 

(D) OTHER INFORMATION: base a 

(ix) FEATURE: 

(A) NAME/KEY: Potential microsequencing oligo 99-2278-276 

(B) LOCATION: 1..23 

(ix) FEATURE: 

(A) NAME/KEY: Potential microsequencing oligo 99-2278-276 

(B) LOCATION: complement 25..47 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 16: 



GAAAAAAATG GGAACATCTT CACAGCCTGT GCATCTCCAA CAAGATT 47 



(2) INFORMATION FOR SEQ ID NO: 17: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 47 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 
(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: polymorphic fragment 99-2312-358 

(B) LOCATION: 1..47 

(ix) FEATURE: 

(A) NAME/KEY: polymorphic base 

(B) LOCATION: 24 

(D) OTHER INFORMATION: base c 

(ix) FEATURE: 

(A) NAME/KEY: Potential microsequencing oligo 99-2312-358 

(B) LOCATION: 1..23 

(ix) FEATURE: 

(A) NAME/KEY: Potential microsequencing oligo 99-2312-358 

(B) LOCATION: complement 25..47 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 17: 



TTGAAGAGAG AGATGGAAAA AAACGTAGGC CTTCTGGGTA AATGGCC 47 



(2) INFORMATION FOR SEQ ID NO: 18: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 47 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: polymorphic fragment 99-2315-213 

(B) LOCATION: 1..47 

(ix) FEATURE: 

(A) NAME/KEY: polymorphic base 

(B) LOCATION: 24 

(D) OTHER INFORMATION: base a 

(ix) FEATURE: 

(A) NAME/KEY: Potential microsequencing oligo 99-2315-213 

(B) LOCATION: 1..23 

(ix) FEATURE: 

(A) NAME/KEY: Potential microsequencing oligo 99-2315-213 

(B) LOCATION: complement 25.. 47 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 18: 
AGATGGATTC TACCCACAGG CAAAAGAAAA CCTTATTTTA AAAATAA 47



(2) INFORMATION FOR SEQ ID NO: 19:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 47 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: polymorphic fragment 99-2320-292 

(B) LOCATION: 1..47

(ix) FEATURE: 

(A) NAME/KEY: polymorphic base 

(B) LOCATION: 24 

(D) OTHER INFORMATION: base c 

(ix) FEATURE: 

(A) NAME/KEY: Potential microsequencing oligo 99-2320-292 

(B) LOCATION: 1..23 

(ix) FEATURE: 

(A) NAME/KEY: Potential microsequencing oligo 99-2320-292 

(B) LOCATION: complement 25..47

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 19: 



ACTCTCATTC ACTAAACTTC AACCGTTTTT ATAAATTTAA TGAATTT 47



(2) INFORMATION FOR SEQ ID NO: 20:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 47 base pairs 

(B) TYPE: NUCLEIC ACID

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 
(A) NAME/KEY: polymorphic fragment 99-2321-82

(B) LOCATION: 1..47 

(ix) FEATURE:

(A) NAME/KEY: polymorphic base

(B) LOCATION: 24 

(D) OTHER INFORMATION: base c 

(ix) FEATURE: 

(A) NAME/KEY: Potential microsequencing oligo 99-2321-82

(B) LOCATION: 1..23

(ix) FEATURE: 

(A) NAME/KEY: Potential microsequencing oligo 99-2321-82

(B) LOCATION: complement 25..47

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 20: 



TAAAGCTTAC TGAGTGTCCA CTCCGGATAC CTACTCAAAT ATTTCCT 47



(2) INFORMATION FOR SEQ ID NO: 21: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 47 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: polymorphic fragment 99-2324-338 

(B) LOCATION: 1..47 

(ix) FEATURE: 

(A) NAME/KEY: polymorphic base 

(B) LOCATION: 24 

(D) OTHER INFORMATION: base a 

(ix) FEATURE: 

(A) NAME/KEY: Potential microsequencing oligo 99-2324-338 

(B) LOCATION: 1..23

(ix) FEATURE: 

(A) NAME/KEY: Potential microsequencing oligo 99-2324-338 

(B) LOCATION: complement 25..47

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 21: 



AGATAGAAGA CAAAATCGCA GGAAAAGAAA TCCCTCAACA GTAAAAA 47
(2) INFORMATION FOR SEQ ID NO: 22:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 47 base pairs

(B) TYPE: NUCLEIC ACID

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE:

(A) NAME/KEY: polymorphic fragment 99-2333-423 

(B) LOCATION: 1..47 

(ix) FEATURE: 

(A) NAME/KEY: polymorphic base 

(B) LOCATION: 24 

(D) OTHER INFORMATION: base g 

(ix) FEATURE: 

(A) NAME/KEY: Potential microsequencing oligo 99-2333-423 

(B) LOCATION: 1..23 

(ix) FEATURE: 

(A) NAME/KEY: Potential microsequencing oligo 99-2333-423 

(B) LOCATION: complement 25..47

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 22: 



GAGACGCTAT CTATGCAAGG AGGGTGTTCA ACATTTGGAC AGCCACG 47



(2) INFORMATION FOR SEQ ID NO: 23: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 47 base pairs

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: polymorphic fragment 99-2341-485 

(B) LOCATION: 1..47 
(ix) FEATURE:

(A) NAME/KEY: polymorphic base 

(B) LOCATION: 24 

(D) OTHER INFORMATION: base c 

(ix) FEATURE: 

(A) NAME/KEY: Potential microsequencing oligo 99-2341-485
(B) LOCATION: 1..23

(ix) FEATURE: 

(A) NAME/KEY: Potential microsequencing oligo 99-2341-485

(B) LOCATION: complement 25..47

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 23: 



ACACATCTGT CTGTTACCTA CACCTTACAA AGAATCGCAC AGGCTCT 47



(2) INFORMATION FOR SEQ ID NO: 24: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 47 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: polymorphic fragment 99-2342-217

(B) LOCATION: 1..47 

(ix) FEATURE: 

(A) NAME/KEY: polymorphic base 

(B) LOCATION: 24 

(D) OTHER INFORMATION: base c 

(ix) FEATURE: 

(A) NAME/KEY: Potential microsequencing oligo 99-2342-217 

(B) LOCATION: 1..23 

(ix) FEATURE: 

(A) NAME/KEY: Potential microsequencing oligo 99-2342-217 

(B) LOCATION: complement 25..47

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 24: 



TAGAGCCTTG GACTTTCATG ACACTTCTAG AAACAGCCCA GATTGTG 47
(2) INFORMATION FOR SEQ ID NO: 25:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 47 base pairs 

(B) TYPE: NUCLEIC ACID

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: polymorphic fragment 99-2362-270 

(B) LOCATION: 1..47 

(ix) FEATURE: 

(A) NAME/KEY: polymorphic base 

(B) LOCATION: 24 

(D) OTHER INFORMATION: base a 

(ix) FEATURE: 

(A) NAME/KEY: Potential microsequencing oligo 99-2362-270

(B) LOCATION: 1..23 

(ix) FEATURE:

(A) NAME/KEY: Potential microsequencing oligo 99-2362-270 

(B) LOCATION: complement 25..47

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 25:
TCTCTCTTGG GTGGTTCCTC AACATGTGTG ACCTTGACCA AGTATTG 47



(2) INFORMATION FOR SEQ ID NO: 26: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 47 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: polymorphic fragment 99-2364-329 

(B) LOCATION: 1..47

(ix) FEATURE: 

(A) NAME/KEY: polymorphic base 

(B) LOCATION: 24 
(D) OTHER INFORMATION: base g
(ix) FEATURE: 

(A) NAME/KEY: Potential microsequencing oligo 99-2364-329

(B) LOCATION: 1..23 

(ix) FEATURE: 

(A) NAME/KEY: Potential microsequencing oligo 99-2364-329
(B) LOCATION: complement 25..47

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 26: 
ATATAAAATG ATGAACCATA TACGTGAGGC AAGGTAACAT ATAATTG 47



(2) INFORMATION FOR SEQ ID NO: 27: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 47 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: polymorphic fragment 99-2367-61 

(B) LOCATION: 1..47

(ix) FEATURE: 

(A) NAME/KEY: polymorphic base 

(B) LOCATION: 24 

(D) OTHER INFORMATION: base a 

(ix) FEATURE: 

(A) NAME/KEY: Potential microsequencing oligo 99-2367-61 

(B) LOCATION: 1..23 

(ix) FEATURE: 

(A) NAME/KEY: Potential microsequencing oligo 99-2367-61 

(B) LOCATION: complement 25..47

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 27:
TAAACATTTC ATTATTTCAG AAAATAATAT GCATTTTCAC CAACACA 47



(2) INFORMATION FOR SEQ ID NO: 28: 
(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 47 base pairs

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: polymorphic fragment 99-2371-93 
(B) LOCATION: 1..47

(ix) FEATURE:

(A) NAME/KEY: polymorphic base 

(B) LOCATION: 24 

(D) OTHER INFORMATION: base a 

(ix) FEATURE: 

(A) NAME/KEY: Potential microsequencing oligo 99-2371-93 

(B) LOCATION: 1..23 

(ix) FEATURE: 

(A) NAME/KEY: Potential microsequencing oligo 99-2371-93 

(B) LOCATION: complement 25..47

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 28: 
CTCTAAACTT TCCTAATACT TACATCACTG CCTACTTTTT ACATAAT 47



(2) INFORMATION FOR SEQ ID NO: 29: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 47 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE:

(A) NAME/KEY: polymorphic fragment 99-2378-200 

(B) LOCATION: 1..47 

(ix) FEATURE: 

(A) NAME/KEY: polymorphic base 

(B) LOCATION: 24 

(D) OTHER INFORMATION: base a 



(ix) FEATURE: 
(A) NAME/KEY: Potential microsequencing oligo 99-2378-200

(B) LOCATION: 1..23 

(ix) FEATURE: 

(A) NAME/KEY: Potential microsequencing oligo 99-2378-200 

(B) LOCATION: complement 25..47

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 29: 



GAGAACTTCC TCTTGAACCT GTTATAGAAC TGTCCTGTCG TCCAAGA 47 



(2) INFORMATION FOR SEQ ID NO: 30:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 47 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: polymorphic fragment 99-2381-394 
(B) LOCATION: 1..47

(ix) FEATURE: 

(A) NAME/KEY: polymorphic base 

(B) LOCATION: 24 

(D) OTHER INFORMATION: base a 

(ix) FEATURE: 

(A) NAME/KEY: Potential microsequencing oligo 99-2381-394 

(B) LOCATION: 1..23 

(ix) FEATURE: 

(A) NAME/KEY: Potential microsequencing oligo 99-2381-394 

(B) LOCATION: complement 25..47

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 30: 



AGTGGTCTTC AGGTTATTGG TAGAGAAAAG TAGGGGAGCT AAAGGTG 47 



(2) INFORMATION FOR SEQ ID NO: 31: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 47 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 



WO 99/04038



21 



PCT/IB98/01193 



(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: polymorphic fragment 99-2413-368 

(B) LOCATION: 1..47 

(ix) FEATURE: 

(A) NAME/KEY: polymorphic base 

(B) LOCATION: 24 

(D) OTHER INFORMATION: base a 

(ix) FEATURE:

(A) NAME/KEY: Potential microsequencing oligo 99-2413-368

(B) LOCATION: 1..23 

(ix) FEATURE: 

(A) NAME/KEY: Potential microsequencing oligo 99-2413-368

(B) LOCATION: complement 25..47

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 31: 



ATTTTAAGAG GAAAACTTAA TGGAAGAATT GTACATAATA TTTCATT 47



(2) INFORMATION FOR SEQ ID NO: 32: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 47 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: polymorphic fragment 99-2419-285 

(B) LOCATION: 1..47

(ix) FEATURE: 

(A) NAME/KEY: polymorphic base 

(B) LOCATION: 24 

(D) OTHER INFORMATION: base c 

(ix) FEATURE: 

(A) NAME/KEY: Potential microsequencing oligo 99-2419-285 

(B) LOCATION: 1..23 
(ix) FEATURE:

(A) NAME/KEY: Potential microsequencing oligo 99-2419-285 

(B) LOCATION: complement 25..47

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 32: 
AAGGGATCAA GCAGTGCCCA CTCCCCACCC TCCAGGGAGC TGTGACT 47



(2) INFORMATION FOR SEQ ID NO: 33: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 47 base pairs 

(B) TYPE: NUCLEIC ACID

(C) STRANDEDNESS: SINGLE 
(D) TOPOLOGY: LINEAR

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: polymorphic fragment 99-2559-253 
(B) LOCATION: 1..47

(ix) FEATURE: 

(A) NAME/KEY: polymorphic base 
(B) LOCATION: 24

(D) OTHER INFORMATION: base g 

(ix) FEATURE: 

(A) NAME/KEY: Potential microsequencing oligo 99-2559-253 

(B) LOCATION: 1..23 

(ix) FEATURE: 

(A) NAME/KEY: Potential microsequencing oligo 99-2559-253 
(B) LOCATION: complement 25..47

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 33: 
CAGGTGTTTT CATGCCCTCT TAGGGTGTGT CACATCATCC ATCTCAA 47



(2) INFORMATION FOR SEQ ID NO: 34: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 47 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 



(ii) MOLECULE TYPE: DNA 
(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: polymorphic fragment 99-2566-112 
(B) LOCATION: 1..47

(ix) FEATURE: 

(A) NAME/KEY: polymorphic base 

(B) LOCATION: 24 

(D) OTHER INFORMATION: base a 

(ix) FEATURE: 

(A) NAME/KEY: Potential microsequencing oligo 99-2566-112

(B) LOCATION: 1..23

(ix) FEATURE: 

(A) NAME/KEY: Potential microsequencing oligo 99-2566-112 

(B) LOCATION: complement 25..47

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 34: 
GCCTTCACAA CCGCAGAGGC AAGAGAAGGA GCTTGGCCAC CCTGACT 47



(2) INFORMATION FOR SEQ ID NO: 35: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 47 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: polymorphic fragment 99-2567-329 

(B) LOCATION: 1..47

(ix) FEATURE: 

(A) NAME/KEY: polymorphic base 
(B) LOCATION: 24

(D) OTHER INFORMATION: base g 

(ix) FEATURE: 

(A) NAME/KEY: Potential microsequencing oligo 99-2567-329 

(B) LOCATION: 1..23

(ix) FEATURE: 

(A) NAME/KEY: Potential microsequencing oligo 99-2567-329 

(B) LOCATION: complement 25..47
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 35:
CACTGTCAGA TATGAAATGA TGCGTGGCTT TCTTTGGGCT ATATTTG 47



(2) INFORMATION FOR SEQ ID NO: 36:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 47 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE:

(A) NAME/KEY: polymorphic fragment 99-2570-218 

(B) LOCATION: 1..47

(ix) FEATURE:

(A) NAME/KEY: polymorphic base 

(B) LOCATION: 24 

(D) OTHER INFORMATION: base c 

(ix) FEATURE:

(A) NAME/KEY: Potential microsequencing oligo 99-2570-218 

(B) LOCATION: 1..23 

(ix) FEATURE: 

(A) NAME/KEY: Potential microsequencing oligo 99-2570-218 

(B) LOCATION: complement 25..47

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 36: 



GGAAAGTTCC AAATTATGAG AAGCGAGGCC TCTGAAGTGG CTAAGTT 47 



(2) INFORMATION FOR SEQ ID NO: 37: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 47 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 
(ix) FEATURE:

(A) NAME/KEY: polymorphic fragment 99-2571-242 

(B) LOCATION: 1..47

(ix) FEATURE:

(A) NAME/KEY: polymorphic base 

(B) LOCATION: 24

(D) OTHER INFORMATION: base a 

(ix) FEATURE: 

(A) NAME/KEY: Potential microsequencing oligo 99-2571-242 

(B) LOCATION: 1..23

(ix) FEATURE: 

(A) NAME/KEY: Potential microsequencing oligo 99-2571-242 

(B) LOCATION: complement 25..47

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 37: 



ATAATGAATG AGTATTTGAT ATTATATAAT TAAATGTGTC AGCATTT 47



(2) INFORMATION FOR SEQ ID NO: 38: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 47 base pairs 

(B) TYPE: NUCLEIC ACID

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: polymorphic fragment 99-2610-121 

(B) LOCATION: 1..47 

(ix) FEATURE: 

(A) NAME/KEY: polymorphic base 

(B) LOCATION: 24 

(D) OTHER INFORMATION: base a 

(ix) FEATURE: 

(A) NAME/KEY: Potential microsequencing oligo 99-2610-121 

(B) LOCATION: 1..23 

(ix) FEATURE: 

(A) NAME/KEY: Potential microsequencing oligo 99-2610-121

(B) LOCATION: complement 25..47

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 38:
ATACCCCTTC CCTAGGTATG GCTATATGCT GCACTTAGAA AATTCTC 47



(2) INFORMATION FOR SEQ ID NO: 39: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 47 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: polymorphic fragment 99-2616-83 

(B) LOCATION: 1..47 

(ix) FEATURE: 

(A) NAME/KEY: polymorphic base 

(B) LOCATION: 24 

(D) OTHER INFORMATION: base c 

(ix) FEATURE: 

(A) NAME/KEY: Potential microsequencing oligo 99-2615-83 
(B) LOCATION: 1..23

(ix) FEATURE: 

(A) NAME/KEY: Potential microsequencing oligo 99-2615-83

(B) LOCATION: complement 25..47

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 39:



AACAAATCAC AAGTTGGCAA AAGCAGCAAA TTCTCATCTT CTGGGAA 47



(2) INFORMATION FOR SEQ ID NO: 40: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 47 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: polymorphic fragment 99-2620-227 
(B) LOCATION: 1..47

(ix) FEATURE: 

(A) NAME/KEY: polymorphic base 

(B) LOCATION: 24 

(D) OTHER INFORMATION: base a 

(ix) FEATURE: 

(A) NAME/KEY: Potential microsequencing oligo 99-2620-227

(B) LOCATION: 1..23

(ix) FEATURE: 

(A) NAME/KEY: Potential microsequencing oligo 99-2620-227

(B) LOCATION: complement 25..47

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 40: 



TTGACTGGGC TCCTGATGTG TCCAGGGTAT CTTGCTGGCT GTTTTGC 47



(2) INFORMATION FOR SEQ ID NO: 41: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 44 base pairs 

(B) TYPE: NUCLEIC ACID

(C) STRANDEDNESS: SINGLE

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: polymorphic fragment 99-2624-407 

(B) LOCATION: 1..44 

(ix) FEATURE: 

(A) NAME/KEY: polymorphic base 

(B) LOCATION: 24 

(D) OTHER INFORMATION: base g 

(ix) FEATURE: 

(A) NAME/KEY: Potential microsequencing oligo 99-2624-407 

(B) LOCATION: 1..23 

(ix) FEATURE: 

(A) NAME/KEY: Potential microsequencing oligo 99-2624-407 

(B) LOCATION: complement 25..44

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 41: 



ATCTGGCCAT AGGCAGAACA TTGGGGGAGA GATGGGGAAA GAGA 
(2) INFORMATION FOR SEQ ID NO: 42:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 47 base pairs 

(B) TYPE: NUCLEIC ACID

(C) STRANDEDNESS: SINGLE

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: polymorphic fragment 99-2625-70 

(B) LOCATION: 1..47 

(ix) FEATURE: 

(A) NAME/KEY: polymorphic base 

(B) LOCATION: 24 

(D) OTHER INFORMATION: base a 

(ix) FEATURE: 

(A) NAME/KEY: Potential microsequencing oligo 99-2625-70

(B) LOCATION: 1..23 

(ix) FEATURE: 

(A) NAME/KEY: Potential microsequencing oligo 99-2625-70

(B) LOCATION: complement 25..47

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 42: 



AGTGACTCAA CCAGAAAGAG AGCAGGAGAG AGGACGAAGA GAGGAGA 47



(2) INFORMATION FOR SEQ ID NO: 43: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 47 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: polymorphic fragment 99-2630-67 

(B) LOCATION: 1..47 

(ix) FEATURE: 
(A) NAME/KEY: polymorphic base

(B) LOCATION: 24 

(D) OTHER INFORMATION: base a 

(ix) FEATURE: 

(A) NAME/KEY: Potential microsequencing oligo 99-2630-67
(B) LOCATION: 1..23

(ix) FEATURE: 

(A) NAME/KEY: Potential microsequencing oligo 99-2630-67

(B) LOCATION: complement 25..47

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 43: 



TAAATTCTGC CTAGAAGATT AAGATTGGTC CAGAACAGGG AGTGTTT 47



(2) INFORMATION FOR SEQ ID NO: 44: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 47 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: polymorphic fragment 99-2633-129 

(B) LOCATION: 1..47 

(ix) FEATURE:

(A) NAME/KEY: polymorphic base 

(B) LOCATION: 24 

(D) OTHER INFORMATION: base a 

(ix) FEATURE: 

(A) NAME/KEY: Potential microsequencing oligo 99-2633-129

(B) LOCATION: 1..23

(ix) FEATURE: 

(A) NAME/KEY: Potential microsequencing oligo 99-2633-129

(B) LOCATION: complement 25..47

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 44: 



TAGCTATTTC TTCCCCTAGG CAAAGTAGAC AATGAGAGAA CCCTTGA 47



(2) INFORMATION FOR SEQ ID NO: 45: 
(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 47 base pairs 

(B) TYPE: NUCLEIC ACID 
(C) STRANDEDNESS: SINGLE
(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: polymorphic fragment 99-2634-341

(B) LOCATION: 1..47

(ix) FEATURE:

(A) NAME/KEY: polymorphic base 

(B) LOCATION: 24 

(D) OTHER INFORMATION: base a 

(ix) FEATURE: 

(A) NAME/KEY: Potential microsequencing oligo 99-2634-341 

(B) LOCATION: 1..23 

(ix) FEATURE: 

(A) NAME/KEY: Potential microsequencing oligo 99-2634-341 

(B) LOCATION: complement 25..47

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 45: 
GGAATCAATA TTTATTTATT ATCAACAGGT GAGACATTAT TTATTTA 47



(2) INFORMATION FOR SEQ ID NO: 46: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 47 base pairs 

(B) TYPE: NUCLEIC ACID

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: polymorphic fragment 99-2637-28 

(B) LOCATION: 1..47 

(ix) FEATURE: 

(A) NAME/KEY: polymorphic base 

(B) LOCATION: 24 

(D) OTHER INFORMATION: base a 
(ix) FEATURE:

(A) NAME/KEY: Potential microsequencing oligo 99-2637-28 

(B) LOCATION: 1..23 

(ix) FEATURE: 

(A) NAME/KEY: Potential microsequencing oligo 99-2637-28 

(B) LOCATION: complement 25..47

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 46:
CCATCACTTC CTCCTAGTGA AAAATCAAAG GAGGGTGGGT TTTATAG 47

(2) INFORMATION FOR SEQ ID NO: 47: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 47 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: polymorphic fragment 99-2642-255 

(B) LOCATION: 1..47 

(ix) FEATURE: 

(A) NAME/KEY: polymorphic base 

(B) LOCATION: 24 

(D) OTHER INFORMATION: base a 

(ix) FEATURE: 

(A) NAME/KEY: Potential microsequencing oligo 99-2642-255 

(B) LOCATION: 1..23 

(ix) FEATURE: 

(A) NAME/KEY: Potential microsequencing oligo 99-2642-255 

(B) LOCATION: complement 25..47

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 47: 
TGAGGGTGTT TCCAGAAGAG ACTAGCATTT GAATCTGAAG TGAGTAA 47 



(2) INFORMATION FOR SEQ ID NO: 48: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 47 base pairs 
(B) TYPE: NUCLEIC ACID

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens

(ix) FEATURE: 

(A) NAME/KEY: polymorphic fragment 99-2645-118 

(B) LOCATION: 1..47 

(ix) FEATURE: 

(A) NAME/KEY: polymorphic base 

(B) LOCATION: 24 

(D) OTHER INFORMATION: base g 

(ix) FEATURE: 

(A) NAME/KEY: Potential microsequencing oligo 99-2645-118 

(B) LOCATION: 1..23 

(ix) FEATURE: 

(A) NAME/KEY: Potential microsequencing oligo 99-2645-118 

(B) LOCATION: complement 25..47

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 48: 
CACAAATTAA TTGCATTGTT ATAGGCTAGC AATGAAGAAT CTGAAAA 47



(2) INFORMATION FOR SEQ ID NO: 49: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 47 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: polymorphic fragment 99-2647-368 

(B) LOCATION: 1..47 

(ix) FEATURE: 

(A) NAME/KEY: polymorphic base 

(B) LOCATION: 24 

(D) OTHER INFORMATION: base a 



(ix) FEATURE: 

(A) NAME/KEY: Potential microsequencing oligo 99-2647-368 
(B) LOCATION: 1..23
(ix) FEATURE: 

(A) NAME/KEY: Potential microsequencing oligo 99-2647-368

(B) LOCATION: complement 25..47

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 49: 



TTAAGGCCTT CAACTGATTA GACAAGGCCC ACTCACATTA TCTGACA 47



(2) INFORMATION FOR SEQ ID NO: 50: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 47 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE:

(A) NAME/KEY: polymorphic fragment 99-2649-107
(B) LOCATION: 1..47

(ix) FEATURE:

(A) NAME/KEY: polymorphic base 

(B) LOCATION: 24 

(D) OTHER INFORMATION: base a 

(ix) FEATURE: 

(A) NAME/KEY: Potential microsequencing oligo 99-2649-107 

(B) LOCATION: 1..23 

(ix) FEATURE: 

(A) NAME/KEY: Potential microsequencing oligo 99-2649-107 

(B) LOCATION: complement 25..47

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 50: 



CACAACTCTG GAGCCTTTTA TGAACAGGAC AGCAATGCAC TGAAACT 47



(2) INFORMATION FOR SEQ ID NO: 51: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 47 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 




(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: polymorphic fragment 99-2103-270 
(B) LOCATION: 1..47

(D) OTHER INFORMATION: variant version of SEQ ID1

(ix) FEATURE:

(A) NAME/KEY: polymorphic base

(B) LOCATION: 24

(D) OTHER INFORMATION: base c; g in SEQ ID1
(ix) FEATURE: 

(A) NAME/KEY: Potential microsequencing oligo 99-2103-270 

(B) LOCATION: 1..23 

(ix) FEATURE:

(A) NAME/KEY: Potential microsequencing oligo 99-2103-270 

(B) LOCATION: complement 25..47

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 51:



CTTGGATTCA TATGAGACAG CTACCAGACC TTCAATTTTT CTACACT 47



(2) INFORMATION FOR SEQ ID NO: 52: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 47 base pairs 

(B) TYPE: NUCLEIC ACID

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: polymorphic fragment 99-2228-301 

(B) LOCATION: 1..47 

(D) OTHER INFORMATION: variant version of SEQ ID2 

(ix) FEATURE: 

(A) NAME/KEY: polymorphic base 

(B) LOCATION: 24 

(D) OTHER INFORMATION: base g; a in SEQ ID2
(ix) FEATURE: 

(A) NAME/KEY: Potential microsequencing oligo 99-2228-301 

(B) LOCATION: 1..23 
(ix) FEATURE:

(A) NAME/KEY: Potential microsequencing oligo 99-2228-301 

(B) LOCATION: complement 25..47

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 52: 



CCCTGCTTAT CCCTGTAAGG TGGGGACCCA TATGGGCAAG GCCAGAC 47



(2) INFORMATION FOR SEQ ID NO: 53: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 47 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE:

(A) NAME/KEY: polymorphic fragment 99-2229-240

(B) LOCATION: 1..47

(D) OTHER INFORMATION: variant version of SEQ ID3 

(ix) FEATURE: 

(A) NAME/KEY: polymorphic base 

(B) LOCATION: 24 

(D) OTHER INFORMATION: base t; g in SEQ ID3 
(ix) FEATURE:

(A) NAME/KEY: Potential microsequencing oligo 99-2229-240 

(B) LOCATION: 1..23 

(ix) FEATURE: 

(A) NAME/KEY: Potential microsequencing oligo 99-2229-240 

(B) LOCATION: complement 25..47

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 53: 



TCGTCATCGT GGCCTGGGCT ACATACTACC TGTTCCAGTC CTTCCAG 47



(2) INFORMATION FOR SEQ ID NO: 54: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 47 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 
(ii) MOLECULE TYPE: DNA

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: polymorphic fragment 99-2240-201 
(B) LOCATION: 1..47

(D) OTHER INFORMATION: variant version of SEQ ID4

(ix) FEATURE: 

(A) NAME/KEY: polymorphic base 

(B) LOCATION: 24

(D) OTHER INFORMATION: base t; c in SEQ ID4
(ix) FEATURE: 

(A) NAME/KEY: Potential microsequencing oligo 99-2240-291

(B) LOCATION: 1..23 

(ix) FEATURE: 

(A) NAME/KEY: Potential microsequencing oligo 99-2240-281 

(B) LOCATION: complement 25..47

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 54: 



GCAATCTTAA TAACTTTTTA TTTTAGTAAT TCGAATCTTT TTTTTCT 47



(2) INFORMATION FOR SEQ ID NO: 55:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 47 base pairs 

(B) TYPE: NUCLEIC ACID 
(C) STRANDEDNESS: SINGLE
(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: polymorphic fragment 99-2242-206 

(B) LOCATION: 1..47 

(D) OTHER INFORMATION: variant version of SEQ ID5

(ix) FEATURE: 

(A) NAME/KEY: polymorphic base 

(B) LOCATION: 24 

(D) OTHER INFORMATION: base t; c in SEQ ID5 
(ix) FEATURE: 

(A) NAME/KEY: Potential microsequencing oligo 99-2242-206 

(B) LOCATION: 1..23 
(ix) FEATURE:

(A) NAME/KEY: Potential microsequencing oligo 99-2242-206
(B) LOCATION: complement 25..47

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 55: 



GTGTTTTCTT TTAGTCAAAT TATTTTATAT TTTACTTTTT TCTTAAG 47



(2) INFORMATION FOR SEQ ID NO: 56: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 47 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: polymorphic fragment 99-2244-83 

(B) LOCATION: 1..47 

(D) OTHER INFORMATION: variant version of SEQ ID6 

(ix) FEATURE: 

(A) NAME/KEY: polymorphic base 

(B) LOCATION: 24 

(D) OTHER INFORMATION: base g; a in SEQ ID6 
(ix) FEATURE: 

(A) NAME/KEY: Potential microsequencing oligo 99-2244-83 

(B) LOCATION: 1..23

(ix) FEATURE: 

(A) NAME/KEY: Potential microsequencing oligo 99-2244-83 

(B) LOCATION: complement 25..47

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 56: 



TAATTGTAGA TACTAAGACC ATTGTGCTTA AACCATGTAG GTACTGA 47



(2) INFORMATION FOR SEQ ID NO: 57: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 47 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 
(ii) MOLECULE TYPE: DNA

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens

(ix) FEATURE: 

(A) NAME/KEY: polymorphic fragment 99-2246-340
(B) LOCATION: 1..47

(D) OTHER INFORMATION: variant version of SEQ ID7 

(ix) FEATURE: 

(A) NAME/KEY: polymorphic base 

(B) LOCATION: 24 

(D) OTHER INFORMATION: base g; a in SEQ ID7 
(ix) FEATURE: 

(A) NAME/KEY: Potential microsequencing oligo 99-2246-340

(B) LOCATION: 1..23 

(ix) FEATURE: 

(A) NAME/KEY: Potential microsequencing oligo 99-2246-340 

(B) LOCATION: complement 25..47

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 57: 



ATTTATATGT TAAATGCAGA GAAGAAGAAA AATAAGTTTT GCAGTAA 47



(2) INFORMATION FOR SEQ ID NO: 58: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 47 base pairs

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: polymorphic fragment 99-2248-76 

(B) LOCATION: 1..47 

(D) OTHER INFORMATION: variant version of SEQ ID8

(ix) FEATURE: 

(A) NAME/KEY: polymorphic base 

(B) LOCATION: 24

(D) OTHER INFORMATION: base t; c in SEQ ID8
(ix) FEATURE: 

(A) NAME/KEY: Potential microsequencing oligo 99-2248-76 

(B) LOCATION: 1..23 
(ix) FEATURE:

(A) NAME/KEY: Potential microsequencing oligo 99-2248-76 

(B) LOCATION: complement 25..47

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 58: 



GACAGAGACG GAAGGTAATC TTCTCCTGAA GTCTGCCCAT CCCCTGG 47



(2) INFORMATION FOR SEQ ID NO: 59: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 47 base pairs

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: polymorphic fragment 99-2250-236 

(B) LOCATION: 1..47

(D) OTHER INFORMATION: variant version of SEQ ID9 

(ix) FEATURE: 

(A) NAME/KEY: polymorphic base 

(B) LOCATION: 24 

(D) OTHER INFORMATION: base t; c in SEQ ID9 
(ix) FEATURE: 

(A) NAME/KEY: Potential microsequencing oligo 99-2250-236 

(B) LOCATION: 1..23 

(ix) FEATURE: 

(A) NAME/KEY: Potential microsequencing oligo 99-2250-236 

(B) LOCATION: complement 25..47

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 59: 



ATGTATCCAA AACAGAATTA ACATACTTTG GGTTTTTTAT TTTTATT 47 



(2) INFORMATION FOR SEQ ID NO: 60: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 47 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR

(ii) MOLECULE TYPE: DNA

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens

(ix) FEATURE:

(A) NAME/KEY: polymorphic fragment 99-2251-151

(B) LOCATION: 1..47

(D) OTHER INFORMATION: variant version of SEQ ID10

(ix) FEATURE: 

(A) NAME/KEY: polymorphic base 

(B) LOCATION: 24 

(D) OTHER INFORMATION: base g; a in SEQ ID10
(ix) FEATURE: 

(A) NAME/KEY: Potential microsequencing oligo 99-2251-151 

(B) LOCATION: 1..23 

(ix) FEATURE: 

(A) NAME/KEY: Potential microsequencing oligo 99-2251-151 

(B) LOCATION: complement 25..47

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 60:



TGAAAAGAAG TTCAGACGAT TGCGGATAGA CTAGTTTGGC TGTTGTG 47



(2) INFORMATION FOR SEQ ID NO: 61: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 47 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: polymorphic fragment 99-2269-179 

(B) LOCATION: 1..47 

(D) OTHER INFORMATION: variant version of SEQ ID11

(ix) FEATURE:

(A) NAME/KEY: polymorphic base

(B) LOCATION: 24

(D) OTHER INFORMATION: base g; a in SEQ ID11
(ix) FEATURE: 

(A) NAME/KEY: Potential microsequencing oligo 99-2269-179 

(B) LOCATION: 1..23

(ix) FEATURE:

(A) NAME/KEY: Potential microsequencing oligo 99-2269-179

(B) LOCATION: complement 25..47

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 61:



AAAATAAAGA AATTCCTAGA GACGTACAGC CTATCAAGAT CAAACCA 47



(2) INFORMATION FOR SEQ ID NO: 62: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 47 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: polymorphic fragment 99-2271-403

(B) LOCATION: 1..47

(D) OTHER INFORMATION: variant version of SEQ ID12 

(ix) FEATURE: 

(A) NAME/KEY: polymorphic base 

(B) LOCATION: 24 

(D) OTHER INFORMATION: base g; a in SEQ ID12
(ix) FEATURE:

(A) NAME/KEY: Potential microsequencing oligo 99-2271-403

(B) LOCATION: 1..23

(ix) FEATURE:

(A) NAME/KEY: Potential microsequencing oligo 99-2271-403

(B) LOCATION: complement 25..47

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 62: 



AGGCATTTAT TTCATATTTA TTAGCCTTGA TTTTCTTATC TTCAAGT 47 



(2) INFORMATION FOR SEQ ID NO: 63: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 47 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR

(ii) MOLECULE TYPE: DNA

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens

(ix) FEATURE:

(A) NAME/KEY: polymorphic fragment 99-2272-409

(B) LOCATION: 1..47

(D) OTHER INFORMATION: variant version of SEQ ID13 

(ix) FEATURE: 

(A) NAME/KEY: polymorphic base 

(B) LOCATION: 24 

(D) OTHER INFORMATION: base t; g in SEQ ID13 
(ix) FEATURE: 

(A) NAME/KEY: Potential microsequencing oligo 99-2272-409 

(B) LOCATION: 1..23

(ix) FEATURE:

(A) NAME/KEY: Potential microsequencing oligo 99-2272-409

(B) LOCATION: complement 25..47

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 63:



AAAAGCACTG CAATTATTTT GGATACTGTG AAATATTGCA AGTTTTA 47



(2) INFORMATION FOR SEQ ID NO: 64: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 47 base pairs

(B) TYPE: NUCLEIC ACID

(C) STRANDEDNESS: SINGLE

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: polymorphic fragment 99-2273-528

(B) LOCATION: 1..47

(D) OTHER INFORMATION: variant version of SEQ ID14

(ix) FEATURE: 

(A) NAME/KEY: polymorphic base 

(B) LOCATION: 24 

(D) OTHER INFORMATION: base t; c in SEQ ID14 
(ix) FEATURE: 

(A) NAME/KEY: Potential microsequencing oligo 99-2273-528 

(B) LOCATION: 1..23

(ix) FEATURE:

(A) NAME/KEY: Potential microsequencing oligo 99-2273-528

(B) LOCATION: complement 25..47

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 64:



ACTTGAAGAT AAGAAAATCA AGGTTAATAA ATATGAAATA AATGCCT 47



(2) INFORMATION FOR SEQ ID NO: 65: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 47 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: polymorphic fragment 99-2275-466 

(B) LOCATION: 1..47

(D) OTHER INFORMATION: variant version of SEQ ID15 

(ix) FEATURE: 

(A) NAME/KEY: polymorphic base 

(B) LOCATION: 24 

(D) OTHER INFORMATION: base t; c in SEQ ID15 
(ix) FEATURE: 

(A) NAME/KEY: Potential microsequencing oligo 99-2275-466 

(B) LOCATION: 1..23 

(ix) FEATURE: 

(A) NAME/KEY: Potential microsequencing oligo 99-2275-466 

(B) LOCATION: complement 25..47

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 65:



TTGATGATAG CATTAAATAC TCCTAAAAAC TGTGAATAGG GATACTA 47



(2) INFORMATION FOR SEQ ID NO: 66: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 47 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR

(ii) MOLECULE TYPE: DNA

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens

(ix) FEATURE:

(A) NAME/KEY: polymorphic fragment 99-2278-276

(B) LOCATION: 1..47

(D) OTHER INFORMATION: variant version of SEQ ID16 

(ix) FEATURE: 

(A) NAME/KEY: polymorphic base 

(B) LOCATION: 24 

(D) OTHER INFORMATION: base g; a in SEQ ID16 
(ix) FEATURE: 

(A) NAME/KEY: Potential microsequencing oligo 99-2278-276 

(B) LOCATION: 1..23 

(ix) FEATURE: 

(A) NAME/KEY: Potential microsequencing oligo 99-2278-276 

(B) LOCATION: complement 25..47

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 66:



GAAAAAAATG GGAACATCTT CACGGCCTGT GCATCTCCAA CAAGATT 47



(2) INFORMATION FOR SEQ ID NO: 67:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 47 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: polymorphic fragment 99-2312-358 

(B) LOCATION: 1..47 

(D) OTHER INFORMATION: variant version of SEQ ID17 

(ix) FEATURE: 

(A) NAME/KEY: polymorphic base 

(B) LOCATION: 24 

(D) OTHER INFORMATION: base t; c in SEQ ID17 
(ix) FEATURE: 

(A) NAME/KEY: Potential microsequencing oligo 99-2312-358 

(B) LOCATION: 1..23

(ix) FEATURE:

(A) NAME/KEY: Potential microsequencing oligo 99-2312-358

(B) LOCATION: complement 25..47

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 67:



TTGAAGAGAG AGATGGAAAA AAATGTAGGC CTTCTGGGTA AATGGCC 47



(2) INFORMATION FOR SEQ ID NO: 68: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 47 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: polymorphic fragment 99-2315-213 

(B) LOCATION: 1..47 

(D) OTHER INFORMATION: variant version of SEQ ID18 

(ix) FEATURE: 

(A) NAME/KEY: polymorphic base 

(B) LOCATION: 24 

(D) OTHER INFORMATION: base g; a in SEQ ID18 
(ix) FEATURE: 

(A) NAME/KEY: Potential microsequencing oligo 99-2315-213 

(B) LOCATION: 1..23 

(ix) FEATURE: 

(A) NAME/KEY: Potential microsequencing oligo 99-2315-213 

(B) LOCATION: complement 25..47

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 68:



AGATGGATTC TACCCACAGG CAAGAGAAAA CCTTATTTTA AAAATAA 47 



(2) INFORMATION FOR SEQ ID NO: 69: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 47 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR

(ii) MOLECULE TYPE: DNA

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens

(ix) FEATURE:

(A) NAME/KEY: polymorphic fragment 99-2320-292

(B) LOCATION: 1..47

(D) OTHER INFORMATION: variant version of SEQ ID19 

(ix) FEATURE: 

(A) NAME/KEY: polymorphic base 

(B) LOCATION: 24 

(D) OTHER INFORMATION: base t; c in SEQ ID19 
(ix) FEATURE:

(A) NAME/KEY: Potential microsequencing oligo 99-2320-292 

(B) LOCATION: 1..23 

(ix) FEATURE: 

(A) NAME/KEY: Potential microsequencing oligo 99-2320-292 

(B) LOCATION: complement 25..47

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 69:



ACTCTCATTC ACTAAACTTC AACTGTTTTT ATAAATTTAA TGAATTT 47



(2) INFORMATION FOR SEQ ID NO: 70: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 47 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: polymorphic fragment 99-2321-82 

(B) LOCATION: 1..47

(D) OTHER INFORMATION: variant version of SEQ ID20 

(ix) FEATURE: 

(A) NAME/KEY: polymorphic base 

(B) LOCATION: 24 

(D) OTHER INFORMATION: base t; c in SEQ ID20 
(ix) FEATURE: 

(A) NAME/KEY: Potential microsequencing oligo 99-2321-82 

(B) LOCATION: 1..23

(ix) FEATURE:

(A) NAME/KEY: Potential microsequencing oligo 99-2321-82

(B) LOCATION: complement 25..47

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 70:



TAAAGCTTAC TGAGTGTCCA CTCTGGATAC CTACTCAAAT ATTTCCT 47



(2) INFORMATION FOR SEQ ID NO: 71: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 47 base pairs 

(B) TYPE: NUCLEIC ACID

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: polymorphic fragment 99-2324-338 

(B) LOCATION: 1..47

(D) OTHER INFORMATION: variant version of SEQ ID21 

(ix) FEATURE: 

(A) NAME/KEY: polymorphic base 

(B) LOCATION: 24 

(D) OTHER INFORMATION: base c; a in SEQ ID21 
(ix) FEATURE: 

(A) NAME/KEY: Potential microsequencing oligo 99-2324-338 

(B) LOCATION: 1..23 

(ix) FEATURE: 

(A) NAME/KEY: Potential microsequencing oligo 99-2324-338 

(B) LOCATION: complement 25..47

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 71:



AGATAGAAGA CAAAATCGCA GGACAAGAAA TCCCTCAACA GTAAAAA 47



(2) INFORMATION FOR SEQ ID NO: 72: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 47 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR

(ii) MOLECULE TYPE: DNA

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: polymorphic fragment 99-2333-423 

(B) LOCATION: 1..47 

(D) OTHER INFORMATION: variant version of SEQ ID22 

(ix) FEATURE: 

(A) NAME/KEY: polymorphic base 

(B) LOCATION: 24

(D) OTHER INFORMATION: base t; g in SEQ ID22
(ix) FEATURE: 

(A) NAME/KEY: Potential microsequencing oligo 99-2333-423 

(B) LOCATION: 1..23 

(ix) FEATURE: 

(A) NAME/KEY: Potential microsequencing oligo 99-2333-423 

(B) LOCATION: complement 25..47

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 72:



GAGACGCTAT CTATGCAAGG AGGTTGTTCA ACATTTGGAC AGCCACG 47



(2) INFORMATION FOR SEQ ID NO: 73: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 47 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: polymorphic fragment 99-2341-485 

(B) LOCATION: 1..47 

(D) OTHER INFORMATION: variant version of SEQ ID23 

(ix) FEATURE: 

(A) NAME/KEY: polymorphic base 

(B) LOCATION: 24 

(D) OTHER INFORMATION: base t; c in SEQ ID23 
(ix) FEATURE: 

(A) NAME/KEY: Potential microsequencing oligo 99-2341-485 

(B) LOCATION: 1..23

(ix) FEATURE:

(A) NAME/KEY: Potential microsequencing oligo 99-2341-485

(B) LOCATION: complement 25..47

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 73:



ACACATCTGT CTGTTACCTA CACTTTACAA AGAATCGCAC AGGCTCT 47



(2) INFORMATION FOR SEQ ID NO: 74: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 47 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: polymorphic fragment 99-2342-217 

(B) LOCATION: 1..47 

(D) OTHER INFORMATION: variant version of SEQ ID24 

(ix) FEATURE:

(A) NAME/KEY: polymorphic base 

(B) LOCATION: 24 

(D) OTHER INFORMATION: base t; c in SEQ ID24 
(ix) FEATURE: 

(A) NAME/KEY: Potential microsequencing oligo 99-2342-217 

(B) LOCATION: 1..23 

(ix) FEATURE: 

(A) NAME/KEY: Potential microsequencing oligo 99-2342-217 

(B) LOCATION: complement 25..47

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 74:



TAGAGCCTTG GACTTTCATG ACATTTCTAG AAACAGCCCA GATTGTG 47



(2) INFORMATION FOR SEQ ID NO: 75: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 47 base pairs

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR

(ii) MOLECULE TYPE: DNA

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: polymorphic fragment 99-2362-270 

(B) LOCATION: 1..47 

(D) OTHER INFORMATION: variant version of SEQ ID25 

(ix) FEATURE: 

(A) NAME/KEY: polymorphic base 

(B) LOCATION: 24 

(D) OTHER INFORMATION: base g; a in SEQ ID25 
(ix) FEATURE: 

(A) NAME/KEY: Potential microsequencing oligo 99-2362-270 

(B) LOCATION: 1..23 

(ix) FEATURE: 

(A) NAME/KEY: Potential microsequencing oligo 99-2362-270 

(B) LOCATION: complement 25..47

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 75:



TCTCTCTTGG GTGGTTCCTC AACGTGTGTG ACCTTGACCA AGTATTG 47



(2) INFORMATION FOR SEQ ID NO: 76: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 47 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: polymorphic fragment 99-2364-329 

(B) LOCATION: 1..47 

(D) OTHER INFORMATION: variant version of SEQ ID26 

(ix) FEATURE: 

(A) NAME/KEY: polymorphic base 

(B) LOCATION: 24 

(D) OTHER INFORMATION: base c; g in SEQ ID26 
(ix) FEATURE: 

(A) NAME/KEY: Potential microsequencing oligo 99-2364-329 

(B) LOCATION: 1..23

(ix) FEATURE:

(A) NAME/KEY: Potential microsequencing oligo 99-2364-329

(B) LOCATION: complement 25..47

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 76:



ATATAAAATG ATGAACCATA TACCTGAGGC AAGGTAACAT ATAATTG 47



(2) INFORMATION FOR SEQ ID NO: 77: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 47 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: polymorphic fragment 99-2367-61 

(B) LOCATION: 1..47 

(D) OTHER INFORMATION: variant version of SEQ ID27 

(ix) FEATURE: 

(A) NAME/KEY: polymorphic base 

(B) LOCATION: 24 

(D) OTHER INFORMATION: base g; a in SEQ ID27 
(ix) FEATURE: 

(A) NAME/KEY: Potential microsequencing oligo 99-2367-61 

(B) LOCATION: 1..23 

(ix) FEATURE: 

(A) NAME/KEY: Potential microsequencing oligo 99-2367-61 

(B) LOCATION: complement 25..47

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 77: 



TAAACATTTC ATTATTTCAG AAAGTAATAT GCATTTTCAC CAACACA 47 



(2) INFORMATION FOR SEQ ID NO: 78: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 47 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR

(ii) MOLECULE TYPE: DNA

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: polymorphic fragment 99-2371-93 

(B) LOCATION: 1..47 

(D) OTHER INFORMATION: variant version of SEQ ID28 

(ix) FEATURE: 

(A) NAME/KEY: polymorphic base 

(B) LOCATION: 24 

(D) OTHER INFORMATION: base c; a in SEQ ID28 
(ix) FEATURE: 

(A) NAME/KEY: Potential microsequencing oligo 99-2371-93 

(B) LOCATION: 1..23

(ix) FEATURE:

(A) NAME/KEY: Potential microsequencing oligo 99-2371-93

(B) LOCATION: complement 25..47

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 78:



CTCTAAACTT TCCTAATACT TACCTCACTG CCTACTTTTT ACATAAT 47



(2) INFORMATION FOR SEQ ID NO: 79: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 47 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: polymorphic fragment 99-2378-200 

(B) LOCATION: 1..47 

(D) OTHER INFORMATION: variant version of SEQ ID29 

(ix) FEATURE: 

(A) NAME/KEY: polymorphic base 

(B) LOCATION: 24 

(D) OTHER INFORMATION: base g; a in SEQ ID29 
(ix) FEATURE: 

(A) NAME/KEY: Potential microsequencing oligo 99-2378-200 

(B) LOCATION: 1..23

(ix) FEATURE:

(A) NAME/KEY: Potential microsequencing oligo 99-2378-200

(B) LOCATION: complement 25..47

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 79:



GAGAACTTCC TGTTGAACCT GTTGTAGAAC TGTCCTGTCG TCCAAGA 47



(2) INFORMATION FOR SEQ ID NO: 80: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 47 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: polymorphic fragment 99-2381-394 

(B) LOCATION: 1..47 

(D) OTHER INFORMATION: variant version of SEQ ID30 

(ix) FEATURE: 

(A) NAME/KEY: polymorphic base

(B) LOCATION: 24

(D) OTHER INFORMATION: base g; a in SEQ ID30 
(ix) FEATURE: 

(A) NAME/KEY: Potential microsequencing oligo 99-2381-394 

(B) LOCATION: 1..23 

(ix) FEATURE:

(A) NAME/KEY: Potential microsequencing oligo 99-2381-394

(B) LOCATION: complement 25..47

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 80:



AGTGGTCTTC AGGTTATTGG TAGGGAAAAG TAGGGGAGCT AAAGGTG 47



(2) INFORMATION FOR SEQ ID NO: 81: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 47 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR

(ii) MOLECULE TYPE: DNA

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: polymorphic fragment 99-2413-368 

(B) LOCATION: 1..47

(D) OTHER INFORMATION: variant version of SEQ ID31 

(ix) FEATURE: 

(A) NAME/KEY: polymorphic base 

(B) LOCATION: 24 

(D) OTHER INFORMATION: base g; a in SEQ ID31
(ix) FEATURE:

(A) NAME/KEY: Potential microsequencing oligo 99-2413-368

(B) LOCATION: 1..23

(ix) FEATURE:

(A) NAME/KEY: Potential microsequencing oligo 99-2413-368

(B) LOCATION: complement 25..47

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 81:



ATTTTAAGAG GAAAACTTAA TGGGAGAATT GTACATAATA TTTCATT 47



(2) INFORMATION FOR SEQ ID NO: 82: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 47 base pairs 

(B) TYPE: NUCLEIC ACID

(C) STRANDEDNESS: SINGLE

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: polymorphic fragment 99-2419-285 

(B) LOCATION: 1..47

(D) OTHER INFORMATION: variant version of SEQ ID32

(ix) FEATURE:

(A) NAME/KEY: polymorphic base

(B) LOCATION: 24

(D) OTHER INFORMATION: base t; c in SEQ ID32 
(ix) FEATURE: 

(A) NAME/KEY: Potential microsequencing oligo 99-2419-285 

(B) LOCATION: 1..23

(ix) FEATURE:

(A) NAME/KEY: Potential microsequencing oligo 99-2419-285

(B) LOCATION: complement 25..47

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 82:



AAGGGATCAA GCAGTGCCCA CTCTCCACCC TCCAGGGAGC TGTGACT 47



(2) INFORMATION FOR SEQ ID NO: 83: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 47 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: polymorphic fragment 99-2559-253 

(B) LOCATION: 1..47 

(D) OTHER INFORMATION: variant version of SEQ ID33 

(ix) FEATURE: 

(A) NAME/KEY: polymorphic base 

(B) LOCATION: 24 

(D) OTHER INFORMATION: base t; g in SEQ ID33 
(ix) FEATURE: 

(A) NAME/KEY: Potential microsequencing oligo 99-2559-253 

(B) LOCATION: 1..23 

(ix) FEATURE: 

(A) NAME/KEY: Potential microsequencing oligo 99-2559-253 

(B) LOCATION: complement 25..47

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 83: 



CAGGTGTTTT CATGCCCTCT TAGTGTGTGT CACATCATCC ATCTCAA 47 



(2) INFORMATION FOR SEQ ID NO: 84: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 47 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR

(ii) MOLECULE TYPE: DNA

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: polymorphic fragment 99-2566-112 

(B) LOCATION: 1..47 

(D) OTHER INFORMATION: variant version of SEQ ID34

(ix) FEATURE:

(A) NAME/KEY: polymorphic base

(B) LOCATION: 24

(D) OTHER INFORMATION: base g; a in SEQ ID34 
(ix) FEATURE: 

(A) NAME/KEY: Potential microsequencing oligo 99-2566-112 

(B) LOCATION: 1..23

(ix) FEATURE:

(A) NAME/KEY: Potential microsequencing oligo 99-2566-112

(B) LOCATION: complement 25..47

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 84:



GCCTTCACAA CCGCAGAGGC AAGGGAAGGA GCTTGGCCAC CCTGACT 47



(2) INFORMATION FOR SEQ ID NO: 85: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 47 base pairs

(B) TYPE: NUCLEIC ACID

(C) STRANDEDNESS: SINGLE

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: polymorphic fragment 99-2567-329 

(B) LOCATION: 1..47

(D) OTHER INFORMATION: variant version of SEQ ID35 

(ix) FEATURE: 

(A) NAME/KEY: polymorphic base 

(B) LOCATION: 24 

(D) OTHER INFORMATION: base t; g in SEQ ID35 
(ix) FEATURE: 

(A) NAME/KEY: Potential microsequencing oligo 99-2567-329 

(B) LOCATION: 1..23

(ix) FEATURE:

(A) NAME/KEY: Potential microsequencing oligo 99-2567-329

(B) LOCATION: complement 25..47

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 85: 



CACTGTCAGA TATGAAATGA TGCTTGGCTT TCTTTGGGCT ATATTTG 47 



(2) INFORMATION FOR SEQ ID NO: 86: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 47 base pairs 

(B) TYPE: NUCLEIC ACID

(C) STRANDEDNESS: SINGLE

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: polymorphic fragment 99-2570-218 

(B) LOCATION: 1..47

(D) OTHER INFORMATION: variant version of SEQ ID36 

(ix) FEATURE: 

(A) NAME/KEY: polymorphic base 

(B) LOCATION: 24 

(D) OTHER INFORMATION: base t; c in SEQ ID36 
(ix) FEATURE: 

(A) NAME/KEY: Potential microsequencing oligo 99-2570-218 

(B) LOCATION: 1..23

(ix) FEATURE:

(A) NAME/KEY: Potential microsequencing oligo 99-2570-218

(B) LOCATION: complement 25..47

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 86:



GGAAAGTTCC AAATTATGAG AAGTGAGGCC TCTGAAGTGG CTAAGTT 47



(2) INFORMATION FOR SEQ ID NO: 87: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 47 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR

(ii) MOLECULE TYPE: DNA

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens

(ix) FEATURE:

(A) NAME/KEY: polymorphic fragment 99-2571-242

(B) LOCATION: 1..47

(D) OTHER INFORMATION: variant version of SEQ ID37 

(ix) FEATURE: 

(A) NAME/KEY: polymorphic base 

(B) LOCATION: 24 

(D) OTHER INFORMATION: base g; a in SEQ ID37 
(ix) FEATURE: 

(A) NAME/KEY: Potential microsequencing oligo 99-2571-242 

(B) LOCATION: 1..23 

(ix) FEATURE: 

(A) NAME/KEY: Potential microsequencing oligo 99-2571-242 

(B) LOCATION: complement 25..47

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 87:



ATAATGAATG AGTATTTGAT ATTGTATAAT TAAATGTGTC AGCATTT 47



(2) INFORMATION FOR SEQ ID NO: 88: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 47 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: polymorphic fragment 99-2610-121 

(B) LOCATION: 1..47 

(D) OTHER INFORMATION: variant version of SEQ ID38

(ix) FEATURE: 

(A) NAME/KEY: polymorphic base 

(B) LOCATION: 24 

(D) OTHER INFORMATION: base c; a in SEQ ID38 
(ix) FEATURE: 

(A) NAME/KEY: Potential microsequencing oligo 99-2610-121 

(B) LOCATION: 1..23

(ix) FEATURE:

(A) NAME/KEY: Potential microsequencing oligo 99-2610-121

(B) LOCATION: complement 25..47

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 88:



ATACCCCTTC CCTAGGTATG GCTCTATGCT GCACTTAGAA AATTCTC 47



(2) INFORMATION FOR SEQ ID NO: 89: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 47 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: polymorphic fragment 99-2615-83 

(B) LOCATION: 1..47 

(D) OTHER INFORMATION: variant version of SEQ ID39 

(ix) FEATURE: 

(A) NAME/KEY: polymorphic base 

(B) LOCATION: 24

(D) OTHER INFORMATION: base t; c in SEQ ID39 
(ix) FEATURE: 

(A) NAME/KEY: Potential microsequencing oligo 99-2615-83 

(B) LOCATION: 1..23

(ix) FEATURE:

(A) NAME/KEY: Potential microsequencing oligo 99-2615-83

(B) LOCATION: complement 25..47

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 89:



AACAAATCAC AAGTTGGCAA AAGTAGCAAA TTCTCATCTT CTGGGAA 47



(2) INFORMATION FOR SEQ ID NO: 90: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 47 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR

(ii) MOLECULE TYPE: DNA

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens

(ix) FEATURE:

(A) NAME/KEY: polymorphic fragment 99-2620-227

(B) LOCATION: 1..47

(D) OTHER INFORMATION: variant version of SEQ ID40 

(ix) FEATURE: 

(A) NAME/KEY: polymorphic base 

(B) LOCATION: 24

(D) OTHER INFORMATION: base g; a in SEQ ID40 
(ix) FEATURE: 

(A) NAME/KEY: Potential microsequencing oligo 99-2620-227 

(B) LOCATION: 1..23 

(ix) FEATURE: 

(A) NAME/KEY: Potential microsequencing oligo 99-2620-227

(B) LOCATION: complement 25..47

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 90:



TTGACTGGGC TCCTGATGTG TCCGGGGTAT CTTGCTGGCT GTTTTGC 47



(2) INFORMATION FOR SEQ ID NO: 91: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 44 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: polymorphic fragment 99-2624-407 

(B) LOCATION: 1..44

(D) OTHER INFORMATION: variant version of SEQ ID41

(ix) FEATURE: 

(A) NAME/KEY: polymorphic base 

(B) LOCATION: 24 

(D) OTHER INFORMATION: base t; g in SEQ ID41
(ix) FEATURE:

(A) NAME/KEY: Potential microsequencing oligo 99-2624-407

(B) LOCATION: 1..23

(ix) FEATURE:

(A) NAME/KEY: Potential microsequencing oligo 99-2624-407

(B) LOCATION: complement 25..44

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 91:



ATCTGGCCAT AGGCAGAACA TTGTGGGAGA GATGGGGAAA GAGA 44



(2) INFORMATION FOR SEQ ID NO: 92: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 47 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: polymorphic fragment 99-2625-70 

(B) LOCATION: 1..47 

(D) OTHER INFORMATION: variant version of SEQ ID42 

(ix) FEATURE: 

(A) NAME/KEY: polymorphic base

(B) LOCATION: 24

(D) OTHER INFORMATION: base g; a in SEQ ID42
(ix) FEATURE: 

(A) NAME/KEY: Potential microsequencing oligo 99-2625-70 

(B) LOCATION: 1..23 

(ix) FEATURE: 

(A) NAME/KEY: Potential microsequencing oligo 99-2625-70 

(B) LOCATION: complement 25..47

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 92:



AGTGACTCAA CCAGAAAGAG AGCGGGAGAG AGGACGAAGA GAGGAGA 47



(2) INFORMATION FOR SEQ ID NO: 93: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 47 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR

(ii) MOLECULE TYPE: DNA

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: polymorphic fragment 99-2630-67 

(B) LOCATION: 1..47 

(D) OTHER INFORMATION: variant version of SEQ ID43

(ix) FEATURE:

(A) NAME/KEY: polymorphic base

(B) LOCATION: 24

(D) OTHER INFORMATION: base g; a in SEQ ID43
(ix) FEATURE: 

(A) NAME/KEY: Potential microsequencing oligo 99-2630-67 

(B) LOCATION: 1..23 

(ix) FEATURE: 

(A) NAME/KEY: Potential microsequencing oligo 99-2630-67 

(B) LOCATION: complement 25..47

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 93:



TAAATTCTGC CTAGAAGATT AAGGTTGGTC CAGAACAGGG AGTGTTT 47



(2) INFORMATION FOR SEQ ID NO: 94: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 47 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: polymorphic fragment 99-2633-129 

(B) LOCATION: 1..47 

(D) OTHER INFORMATION: variant version of SEQ ID44 

(ix) FEATURE: 

(A) NAME/KEY: polymorphic base 

(B) LOCATION: 24 

(D) OTHER INFORMATION: base c; a in SEQ ID44 
(ix) FEATURE: 

(A) NAME/KEY: Potential microsequencing oligo 99-2633-129 

(B) LOCATION: 1..23

(ix) FEATURE:

(A) NAME/KEY: Potential microsequencing oligo 99-2633-129

(B) LOCATION: complement 25..47

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 94:



TAGCTATTTC TTCCCCTAGG CAACGTAGAC AATGAGAGAA CCCTTGA 47



(2) INFORMATION FOR SEQ ID NO: 95: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 47 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: polymorphic fragment 99-2634-341 

(B) LOCATION: 1..47 

(D) OTHER INFORMATION: variant version of SEQ ID45 

(ix) FEATURE: 

(A) NAME/KEY: polymorphic base 

(B) LOCATION: 24 

(D) OTHER INFORMATION: base g; a in SEQ ID45
(ix) FEATURE: 

(A) NAME/KEY: Potential microsequencing oligo 99-2634-341 

(B) LOCATION: 1..23 

(ix) FEATURE: 

(A) NAME/KEY: Potential microsequencing oligo 99-2634-341 

(B) LOCATION: complement 25..47

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 95: 



GGAATCAATA TTTATTTATT ATCGACAGGT GAGACATTAT TTATTTA 47 



(2) INFORMATION FOR SEQ ID NO: 96: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 47 base pairs 

(B) TYPE: NUCLEIC ACID 
(C) STRANDEDNESS: SINGLE

(D) TOPOLOGY: LINEAR



WO 99/04038

PCT/IB98/01193



(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: polymorphic fragment 99-2637-28

(B) LOCATION: 1..47

(D) OTHER INFORMATION: variant version of SEQ ID46

(ix) FEATURE: 

(A) NAME/KEY: polymorphic base 

(B) LOCATION: 24 

(D) OTHER INFORMATION: base g; a in SEQ ID46 
(ix) FEATURE: 

(A) NAME/KEY: Potential microsequencing oligo 99-2637-28

(B) LOCATION: 1..23 

(ix) FEATURE: 

(A) NAME/KEY: Potential microsequencing oligo 99-2637-28 

(B) LOCATION: complement 25..47

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 96:



CCATCACTTC CTCCTAGTGA AAAGTCAAAG GAGGGTGGGT TTTATAG 47



(2) INFORMATION FOR SEQ ID NO: 97: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 47 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: polymorphic fragment 99-2642-255 

(B) LOCATION: 1..47 

(D) OTHER INFORMATION: variant version of SEQ ID47

(ix) FEATURE: 

(A) NAME/KEY: polymorphic base 

(B) LOCATION: 24 

(D) OTHER INFORMATION: base g; a in SEQ ID47 
(ix) FEATURE: 

(A) NAME/KEY: Potential microsequencing oligo 99-2642-255 

(B) LOCATION: 1..23

(ix) FEATURE:

(A) NAME/KEY: Potential microsequencing oligo 99-2642-255 

(B) LOCATION: complement 25..47

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 97:



TGAGGGTGTT TCCAGAAGAG ACTGGCATTT GAATCTGAAG TGAGTAA 47



(2) INFORMATION FOR SEQ ID NO: 98:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 47 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: polymorphic fragment 99-2645-118 

(B) LOCATION: 1..47 

(D) OTHER INFORMATION: variant version of SEQ ID48 

(ix) FEATURE: 

(A) NAME/KEY: polymorphic base 

(B) LOCATION: 24 

(D) OTHER INFORMATION: base t; g in SEQ ID48
(ix) FEATURE: 

(A) NAME/KEY: Potential microsequencing oligo 99-2645-118 

(B) LOCATION: 1..23 

(ix) FEATURE: 

(A) NAME/KEY: Potential microsequencing oligo 99-2645-118 

(B) LOCATION: complement 25..47

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 98:



CACAAATTAA TTGCATTGTT ATATGCTAGC AATGAAGAAT CTGAAAA 47



(2) INFORMATION FOR SEQ ID NO: 99: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 47 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 
(ii) MOLECULE TYPE: DNA

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: polymorphic fragment 99-2647-368

(B) LOCATION: 1..47

(D) OTHER INFORMATION: variant version of SEQ ID49 

(ix) FEATURE:

(A) NAME/KEY: polymorphic base 

(B) LOCATION: 24

(D) OTHER INFORMATION: base g; a in SEQ ID49 
(ix) FEATURE: 

(A) NAME/KEY: Potential microsequencing oligo 99-2647-368 

(B) LOCATION: 1..23

(ix) FEATURE: 

(A) NAME/KEY: Potential microsequencing oligo 99-2647-368 

(B) LOCATION: complement 25..47

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 99:



TTAAGGCCTT CAACTGATTA GACGAGGCCC ACTCACATTA TCTGACA 47



(2) INFORMATION FOR SEQ ID NO: 100: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 47 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: polymorphic fragment 99-2649-107

(B) LOCATION: 1..47 

(D) OTHER INFORMATION: variant version of SEQ ID50 

(ix) FEATURE: 

(A) NAME/KEY: polymorphic base 

(B) LOCATION: 24 

(D) OTHER INFORMATION: base t; a in SEQ ID50 
(ix) FEATURE: 

(A) NAME/KEY: Potential microsequencing oligo 99-2649-107

(B) LOCATION: 1..23 
(ix) FEATURE:

(A) NAME/KEY: Potential microsequencing oligo 99-2649-107 

(B) LOCATION: complement 25..47

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 100: 



CACAACTCTG GAGCCTTTTA TGATCAGGAC AGCAATGCAC TGAAACT 47



(2) INFORMATION FOR SEQ ID NO: 101: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 18 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: upstream amplification primer for SEQ ID1, SEQ ID51

(B) LOCATION: 1..18 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 101: 



CCTGGATTCT GACCCATC 18 



(2) INFORMATION FOR SEQ ID NO: 102: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 18 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: upstream amplification primer for SEQ ID2, SEQ ID52 

(B) LOCATION: 1..18 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 102: 



TCTACCTCTA CCTCTTTC 18
(2) INFORMATION FOR SEQ ID NO: 103:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 19 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: upstream amplification primer for SEQ ID3, SEQ ID53 

(B) LOCATION: 1..19 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 103: 



CTTCCCATAC CTCTGATAC 19 



(2) INFORMATION FOR SEQ ID NO: 104: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 18 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 
(D) TOPOLOGY: LINEAR

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: upstream amplification primer for SEQ ID4, SEQ ID54 

(B) LOCATION: 1..18

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 104: 



TTCAACAGTG AAGCCATC 18 



(2) INFORMATION FOR SEQ ID NO: 105: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 18 base pairs

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 
(D) TOPOLOGY: LINEAR

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE:

(A) NAME/KEY: upstream amplification primer for SEQ ID5, SEQ ID55

(B) LOCATION: 1..18

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 105: 



TGATGTGTGT GACTCAGG 18 



(2) INFORMATION FOR SEQ ID NO: 106: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 18 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: upstream amplification primer for SEQ ID6, SEQ ID56

(B) LOCATION: 1..18 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 106: 



ATAGAGGAAC CAAACCTG 18 



(2) INFORMATION FOR SEQ ID NO: 107: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 18 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: upstream amplification primer for SEQ ID7, SEQ ID57 
(B) LOCATION: 1..18
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 107: 



AGCAGCATGG AAGCAAAC 18 



(2) INFORMATION FOR SEQ ID NO: 108:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 18 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE:

(A) NAME/KEY: upstream amplification primer for SEQ ID8, SEQ ID58

(B) LOCATION: 1..18 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 108: 



CTGATGAAAG TGGCTCTC 18 



(2) INFORMATION FOR SEQ ID NO: 109: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 19 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: upstream amplification primer for SEQ ID9, SEQ ID59 

(B) LOCATION: 1..19 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 109: 



TGTATCTGAG GTCTAAAAC 19



(2) INFORMATION FOR SEQ ID NO: 110:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 18 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: upstream amplification primer for SEQ ID10, SEQ ID60

(B) LOCATION: 1..18 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 110: 



TATATGTAGA GGGTGAGG 18 



(2) INFORMATION FOR SEQ ID NO: 111: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 19 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: upstream amplification primer for SEQ ID11, SEQ ID61

(B) LOCATION: 1..19

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 111: 



AGGCTAAGAA AAAAAGAGG 19 



(2) INFORMATION FOR SEQ ID NO: 112: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 19 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 



(ii) MOLECULE TYPE: DNA 
(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: upstream amplification primer for SEQ ID12, SEQ ID62

(B) LOCATION: 1..19 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 112: 



TGAAAAGACT AAGTTCTGG 19 



(2) INFORMATION FOR SEQ ID NO: 113: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 18 base pairs 

(B) TYPE: NUCLEIC ACID 
(C) STRANDEDNESS: SINGLE
(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: upstream amplification primer for SEQ ID13, SEQ ID63 

(B) LOCATION: 1..18 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 113: 



ATGCTAGAGG AAAGGAAC 18 



(2) INFORMATION FOR SEQ ID NO: 114: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 18 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: upstream amplification primer for SEQ ID14, SEQ ID64 

(B) LOCATION: 1..18 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 114: 
ATACCAGGGA CTTTAGTG 18



(2) INFORMATION FOR SEQ ID NO: 115: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 19 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: upstream amplification primer for SEQ ID15, SEQ ID65 

(B) LOCATION: 1..19 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 115: 
AGATTCAGAC CAATTTCAC 19 



(2) INFORMATION FOR SEQ ID NO: 116: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 18 base pairs

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: upstream amplification primer for SEQ ID16, SEQ ID66 

(B) LOCATION: 1..18 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 116:
TGCTTTGATT TGACCCTG 18 



(2) INFORMATION FOR SEQ ID NO: 117: 
(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 19 base pairs

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: upstream amplification primer for SEQ ID17, SEQ ID67

(B) LOCATION: 1..19 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 117: 



GCCTATCTTG TTTTGACTG 19 



(2) INFORMATION FOR SEQ ID NO: 118:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 19 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: upstream amplification primer for SEQ ID18, SEQ ID68 

(B) LOCATION: 1..19 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 118: 



TTCAGAGCAA CAATTTTGG 19 



(2) INFORMATION FOR SEQ ID NO: 119: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 20 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 
(ix) FEATURE:

(A) NAME/KEY: upstream amplification primer for SEQ ID19, SEQ ID69 

(B) LOCATION: 1..20

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 119:



CCAAGTTTAT GAGATTAGAG 20 



(2) INFORMATION FOR SEQ ID NO: 120: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 19 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: upstream amplification primer for SEQ ID20, SEQ ID70 

(B) LOCATION: 1..19 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 120: 



CTAACCTAGA TGATCTTCC 19 



(2) INFORMATION FOR SEQ ID NO: 121:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 18 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: upstream amplification primer for SEQ ID21, SEQ ID71 

(B) LOCATION: 1..18 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 121: 



TGTCCCAAGT TTAGTTCC 18
(2) INFORMATION FOR SEQ ID NO: 122:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 21 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: upstream amplification primer for SEQ ID22, SEQ ID72

(B) LOCATION: 1..21 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 122: 



CCAGGAATAA TACTTTGCAT C 21 



(2) INFORMATION FOR SEQ ID NO: 123: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 19 base pairs 

(B) TYPE: NUCLEIC ACID

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: upstream amplification primer for SEQ ID23, SEQ ID73

(B) LOCATION: 1..19 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 123: 



CTCAGTTTTT CTTTCCACC 19 



(2) INFORMATION FOR SEQ ID NO: 124: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 
(D) TOPOLOGY: LINEAR

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: upstream amplification primer for SEQ ID24, SEQ ID74

(B) LOCATION: 1..20

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 124: 



GACTCAGGCA CAACTTTTAG 20 



(2) INFORMATION FOR SEQ ID NO: 125: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 19 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: upstream amplification primer for SEQ ID25, SEQ ID75

(B) LOCATION: 1..19 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 125: 



TACAGCAATG GTATAAAGC 19 



(2) INFORMATION FOR SEQ ID NO: 126: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 



(ix) FEATURE: 

(A) NAME/KEY: upstream amplification primer for SEQ ID26, SEQ ID76 
(B) LOCATION: 1..20
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 126: 



TTATCCATCA TTTAGAAGGC 20 



(2) INFORMATION FOR SEQ ID NO: 127: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 18 base pairs 

(B) TYPE: NUCLEIC ACID

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: upstream amplification primer for SEQ ID27, SEQ ID77 

(B) LOCATION: 1..18

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 127: 



CACTGGAGAT AGCTGAAC 18 



(2) INFORMATION FOR SEQ ID NO: 128: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 19 base pairs 

(B) TYPE: NUCLEIC ACID 
(C) STRANDEDNESS: SINGLE
(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: upstream amplification primer for SEQ ID28, SEQ ID78 

(B) LOCATION: 1..19 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 128: 



GTACTGTCAA ATCATCACC 19
(2) INFORMATION FOR SEQ ID NO: 129:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 18 base pairs 

(B) TYPE: NUCLEIC ACID

(C) STRANDEDNESS: SINGLE

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: upstream amplification primer for SEQ ID29, SEQ ID79 
(B) LOCATION: 1..18

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 129: 



CGGGCATAAA AATGCAGG 18 



(2) INFORMATION FOR SEQ ID NO: 130: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: upstream amplification primer for SEQ ID30, SEQ ID80 

(B) LOCATION: 1..20 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 130: 



GTATATGTGA AGGTTGTGGG 20 



(2) INFORMATION FOR SEQ ID NO: 131: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 19 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 



(ii) MOLECULE TYPE: DNA 



(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: upstream amplification primer for SEQ ID31, SEQ ID81

(B) LOCATION: 1..19

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 131: 



GTAACATGTG ACTTGCTCC 19



(2) INFORMATION FOR SEQ ID NO: 132: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: upstream amplification primer for SEQ ID32, SEQ ID82

(B) LOCATION: 1..20 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 132: 



CCAGCTTGAA TTTTGGTGAG 20 



(2) INFORMATION FOR SEQ ID NO: 133: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 18 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: upstream amplification primer for SEQ ID33, SEQ ID83 

(B) LOCATION: 1..18 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 133: 
GCATATCTTG GTGGTCTG 18

(2) INFORMATION FOR SEQ ID NO: 134: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 19 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens

(ix) FEATURE: 

(A) NAME/KEY: upstream amplification primer for SEQ ID34, SEQ ID84

(B) LOCATION: 1..19 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 134: 
AGGGTTCAAA GGAAGGAGG 19 



(2) INFORMATION FOR SEQ ID NO: 135: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: upstream amplification primer for SEQ ID35, SEQ ID85 

(B) LOCATION: 1..20 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 135:
GAAAAAGAAG GGAAAGAAAG 20 



(2) INFORMATION FOR SEQ ID NO: 136: 
(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 19 base pairs

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: upstream amplification primer for SEQ ID36, SEQ ID86

(B) LOCATION: 1..19 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 136: 



GTTTGTCTTG GCTATTAAG 19 



(2) INFORMATION FOR SEQ ID NO: 137: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 18 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: upstream amplification primer for SEQ ID37, SEQ ID87 

(B) LOCATION: 1..18 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 137: 



TGAAAAAGTG GGTAGCAG 18 



(2) INFORMATION FOR SEQ ID NO: 138: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 19 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 
(ix) FEATURE:

(A) NAME/KEY: upstream amplification primer for SEQ ID38, SEQ ID88

(B) LOCATION: 1..19

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 138: 



ATATCAGGGC AGGCACAAG 19 



(2) INFORMATION FOR SEQ ID NO: 139: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 18 base pairs 

(B) TYPE: NUCLEIC ACID 
(C) STRANDEDNESS: SINGLE
(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: upstream amplification primer for SEQ ID39, SEQ ID89 

(B) LOCATION: 1..18

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 139: 



GGAAGAGGGC AACTTTAC 18 



(2) INFORMATION FOR SEQ ID NO: 140: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 18 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: upstream amplification primer for SEQ ID40, SEQ ID90 

(B) LOCATION: 1..18 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 140: 



TGAAATGGGC TGTAGATG 18
(2) INFORMATION FOR SEQ ID NO: 141:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 18 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: upstream amplification primer for SEQ ID41, SEQ ID91

(B) LOCATION: 1..18 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 141: 



TTA.=ACCTTG GCTTCCTG 18



(2) INFORMATION FOR SEQ ID NO: 142: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 18 base pairs 

(B) TYPE: NUCLEIC ACID 
(C) STRANDEDNESS: SINGLE
(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: upstream amplification primer for SEQ ID42, SEQ ID92 

(B) LOCATION: 1..18 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 142: 



TTCAACCTTT TGTCGCTG 18 



(2) INFORMATION FOR SEQ ID NO: 143: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 19 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 
(D) TOPOLOGY: LINEAR

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: upstream amplification primer for SEQ ID43, SEQ ID93

(B) LOCATION: 1..19 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 143: 



ATGTAACAGA TGTCCAAAG 19 



(2) INFORMATION FOR SEQ ID NO: 144: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 18 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: upstream amplification primer for SEQ ID44, SEQ ID94 

(B) LOCATION: 1..18 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 144: 



CTAAGGGTCT TCTTTCTG 18 



(2) INFORMATION FOR SEQ ID NO: 145:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 19 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: upstream amplification primer for SEQ ID45, SEQ ID95 
(B) LOCATION: 1..19

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 145: 

GGTGTATTTA GGTTTGTGG 19 

(2) INFORMATION FOR SEQ ID NO: 146:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 18 base pairs 
(B) TYPE: NUCLEIC ACID

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: upstream amplification primer for SEQ ID46, SEQ ID96 

(B) LOCATION: 1..18 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 146: 
CTACCATCAC TTTCCTCC 18 



(2) INFORMATION FOR SEQ ID NO: 147: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 18 base pairs

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: upstream amplification primer for SEQ ID47, SEQ ID97 

(B) LOCATION: 1..18 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 147: 
ATAACTAGGC ATCCAGAC 18
(2) INFORMATION FOR SEQ ID NO: 148:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: upstream amplification primer for SEQ ID48, SEQ ID98

(B) LOCATION: 1..20 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 148: 



CGACATAATT TGGTATGTAG 20 



(2) INFORMATION FOR SEQ ID NO: 149:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 18 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: upstream amplification primer for SEQ ID49, SEQ ID99 

(B) LOCATION: 1..18

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 149: 



TCACCAAGTG TCATCGTC 18 



(2) INFORMATION FOR SEQ ID NO: 150: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 19 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 



(ii) MOLECULE TYPE: DNA 
(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: upstream amplification primer for SEQ ID50, SEQ ID100

(B) LOCATION: 1..19

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 150:

GAGACTTTGT AACTTTGTG 19 

(2) INFORMATION FOR SEQ ID NO: 151: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 base pairs 

(B) TYPE: NUCLEIC ACID 
(C) STRANDEDNESS: SINGLE
(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 



(ix) FEATURE:

(A) NAME/KEY: downstream amplification primer for SEQ ID1, SEQ ID51

(B) LOCATION: 1..20 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 151: 



GTCTTCATAA GTCTTCAGTG 20 



(2) INFORMATION FOR SEQ ID NO: 152: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 18 base pairs

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: downstream amplification primer for SEQ ID2, SEQ ID52

(B) LOCATION: 1..18
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 152: 

CAAAACACTC CCTCACAC 18 



(2) INFORMATION FOR SEQ ID NO: 153: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 18 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens

(ix) FEATURE:

(A) NAME/KEY: downstream amplification primer for SEQ ID3, SEQ ID53

(B) LOCATION: 1..18

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 153:



CAGGTGATGT CTGGATAC 18



(2) INFORMATION FOR SEQ ID NO: 154: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 21 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 



(ix) FEATURE:

(A) NAME/KEY: downstream amplification primer for SEQ ID4, SEQ ID54

(B) LOCATION: 1..21 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 154: 



AAGACAACAA GAACTAAATC C 21 
(2) INFORMATION FOR SEQ ID NO: 155:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 base pairs 

(B) TYPE: NUCLEIC ACID

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 



(ix) FEATURE:

(A) NAME/KEY: downstream amplification primer for SEQ ID5, SEQ ID55

(B) LOCATION: 1..20 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 155: 



TCCCCAATAG ATTAAAGTTC 20 



(2) INFORMATION FOR SEQ ID NO: 156: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 18 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 



(ix) FEATURE:

(A) NAME/KEY: downstream amplification primer for SEQ ID6, SEQ ID56

(B) LOCATION: 1..18 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 156: 



CTGAGCATCA AATAGGAG 18 



(2) INFORMATION FOR SEQ ID NO: 157: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 21 base pairs 

(B) TYPE: NUCLEIC ACID 
(C) STRANDEDNESS: SINGLE

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: downstream amplification primer for SEQ ID7, SEQ ID57

(B) LOCATION: 1..21

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 157:



TCATTACAGA AAAAGCCAAA G 21



(2) INFORMATION FOR SEQ ID NO: 158: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 



(ix) FEATURE:

(A) NAME/KEY: downstream amplification primer for SEQ ID8, SEQ ID58

(B) LOCATION: 1..20 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 158: 



TCCTTCTCCA CCTAAAATTC 20 



(2) INFORMATION FOR SEQ ID NO: 159: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 18 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 



(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 
(ix) FEATURE:

(A) NAME/KEY: downstream amplification primer for SEQ ID9, SEQ ID59

(B) LOCATION: 1..18
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 159: 



ACTGCTTCTG CTCTCTTG 18 



(2) INFORMATION FOR SEQ ID NO: 160: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: downstream amplification primer for SEQ ID10, SEQ ID60

(B) LOCATION: 1..20

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 160:



TGAACATACA f^J^AAClKCTGG 20 



(2) INFORMATION FOR SEQ ID NO: 161: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 18 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: downstream amplification primer for SEQ ID11, SEQ ID61

(B) LOCATION: 1..18



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 161: 
AGAGTTGTTG GCATGTAG 18



(2) INFORMATION FOR SEQ ID NO: 162: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 18 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: downstream amplification primer for SEQ ID12, SEQ ID62

(B) LOCATION: 1..18

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 162:



AACTGCTCAG CAACTGTG 18 



(2) INFORMATION FOR SEQ ID NO: 163: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 21 base pairs 

(B) TYPE: NUCLEIC ACID 
(C) STRANDEDNESS: SINGLE 
(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 



(ix) FEATURE: 

(A) NAME/KEY: downstream amplification primer for SEQ ID13, SEQ ID63 

(B) LOCATION: 1..21 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 163: 



TTAGAACACT TTTATGGGAA C 21 



(2) INFORMATION FOR SEQ ID NO: 164: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 19 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: downstream amplification primer for SEQ ID14, SEQ ID64 

(B) LOCATION: 1..19 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 164: 

GTCCTAGAAT GAGCAAATG 19 



(2) INFORMATION FOR SEQ ID NO: 165: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 18 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 



(ix) FEATURE: 

(A) NAME/KEY: downstream amplification primer for SEQ ID15, SEQ ID65 

(B) LOCATION: 1..18 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 165: 



AGAGAAAGAA CCAGAGCC 18 



(2) INFORMATION FOR SEQ ID NO: 166: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 18 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: downstream amplification primer for SEQ ID16, SEQ ID66 

(B) LOCATION: 1..18 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 166: 

TGGAGTCTAA ACTAGGTG 18 



(2) INFORMATION FOR SEQ ID NO: 167: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 18 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 



(ix) FEATURE: 

(A) NAME/KEY: downstream amplification primer for SEQ ID17, SEQ ID67 

(B) LOCATION: 1..18 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 167: 



GGACCTTTTA AGAGTGTG 18 



(2) INFORMATION FOR SEQ ID NO: 168: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 19 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 



(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 



(ix) FEATURE: 

(A) NAME/KEY: downstream amplification primer for SEQ ID18, SEQ ID68 

(B) LOCATION: 1..19 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 168: 

TGGTTTCTTC AAACAAGAG 19 



(2) INFORMATION FOR SEQ ID NO: 169: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 21 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: downstream amplification primer for SEQ ID19, SEQ ID69 

(B) LOCATION: 1..21 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 169: 



AAGTTGGATA ACCTTCTTTT G 21 



(2) INFORMATION FOR SEQ ID NO: 170: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 19 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 



(ix) FEATURE: 

(A) NAME/KEY: downstream amplification primer for SEQ ID20, SEQ ID70 

(B) LOCATION: 1..19 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 170: 

TAGTTTCGTG AACTTATCC 19 



(2) INFORMATION FOR SEQ ID NO: 171: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 21 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: downstream amplification primer for SEQ ID21, SEQ ID71 

(B) LOCATION: 1..21 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 171: 



GTTTACATTA TGCCCCTTTT C 21 



(2) INFORMATION FOR SEQ ID NO: 172: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 18 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 



(ix) FEATURE: 

(A) NAME/KEY: downstream amplification primer for SEQ ID22, SEQ ID72 

(B) LOCATION: 1..18 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 172: 



CTCCACTGCC ACAACTTC 18 



(2) INFORMATION FOR SEQ ID NO: 173: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 21 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: downstream amplification primer for SEQ ID23, SEQ ID73 

(B) LOCATION: 1..21 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 173: 



TGCTCTGCTT GTAATGTTAT G 21 



(2) INFORMATION FOR SEQ ID NO: 174: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 19 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: downstream amplification primer for SEQ ID24, SEQ ID74 

(B) LOCATION: 1..19 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 174: 



CAAGGTTGCC AGTCACATC 19 



(2) INFORMATION FOR SEQ ID NO: 175: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 18 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 



(ii) MOLECULE TYPE: DNA 



WO 99/04038 



99 



PCT/IB98/01193 



(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: downstream amplification primer for SEQ ID25, SEQ ID75 

(B) LOCATION: 1..18 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 175: 

ATGAAGATAC GCAGCCAG 18 

(2) INFORMATION FOR SEQ ID NO: 176: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 21 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 



(ix) FEATURE: 

(A) NAME/KEY: downstream amplification primer for SEQ ID26, SEQ ID76 

(B) LOCATION: 1..21 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 176: 



CTCATTTAAC TCCCATTCCT C 21 



(2) INFORMATION FOR SEQ ID NO: 177: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 21 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 



(ix) FEATURE: 

(A) NAME/KEY: downstream amplification primer for SEQ ID27, SEQ ID77 

(B) LOCATION: 1..21 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 177: 



TGCTTTTCTT GTCCCTGATT G 21 



(2) INFORMATION FOR SEQ ID NO: 178: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: downstream amplification primer for SEQ ID28, SEQ ID78 

(B) LOCATION: 1..20 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 178: 



GCATTGAATC CGTAAATTTC 20 



(2) INFORMATION FOR SEQ ID NO: 179: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 21 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: downstream amplification primer for SEQ ID29, SEQ ID79 

(B) LOCATION: 1..21 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 179: 



CAGTTTTGGT CATTGTGGGA G 21 



(2) INFORMATION FOR SEQ ID NO: 180: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 21 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 



(ix) FEATURE: 

(A) NAME/KEY: downstream amplification primer for SEQ ID30, SEQ ID80 

(B) LOCATION: 1..21 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 180: 



AAATCCAACT ATGTCACTTC C 21 



(2) INFORMATION FOR SEQ ID NO: 181: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 19 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 



(ix) FEATURE: 

(A) NAME/KEY: downstream amplification primer for SEQ ID31, SEQ ID81 

(B) LOCATION: 1..19 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 181: 



AATGTCCCCT CCTCCTCTG 19 



(2) INFORMATION FOR SEQ ID NO: 182: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: downstream amplification primer for SEQ ID32, SEQ ID82 

(B) LOCATION: 1..20 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 182: 



GCCACAAGTA TTTGGGTGCC 20 



(2) INFORMATION FOR SEQ ID NO: 183: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 19 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 



(ix) FEATURE: 

(A) NAME/KEY: downstream amplification primer for SEQ ID33, SEQ ID83 

(B) LOCATION: 1..19 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 183: 



CCTACGGTTT GTCATAAAG 19 



(2) INFORMATION FOR SEQ ID NO: 184: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 21 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 






(ix) FEATURE: 

(A) NAME/KEY: downstream amplification primer for SEQ ID34, SEQ ID84 

(B) LOCATION: 1..21 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 184: 

TGTAACAGGG GACATGGGAA G 21 



(2) INFORMATION FOR SEQ ID NO: 185: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 
(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: downstream amplification primer for SEQ ID35, SEQ ID85 

(B) LOCATION: 1..20 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 185: 

CAATTTTGTA TGGATGACAG 20 



(2) INFORMATION FOR SEQ ID NO: 186: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 21 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 



(ix) FEATURE: 

(A) NAME/KEY: downstream amplification primer for SEQ ID36, SEQ ID86 

(B) LOCATION: 1..21 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 186: 

TGGTGGTGGA AAAAAAGAAG G 21 



(2) INFORMATION FOR SEQ ID NO: 187: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 21 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 



(ix) FEATURE: 

(A) NAME/KEY: downstream amplification primer for SEQ ID37, SEQ ID87 

(B) LOCATION: 1..21 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 187: 



CTATAACTCT TATCAGTGAA C 21 



(2) INFORMATION FOR SEQ ID NO: 188: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 



(ix) FEATURE: 

(A) NAME/KEY: downstream amplification primer for SEQ ID38, SEQ ID88 

(B) LOCATION: 1..20 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 188: 

AGGTCACTCA AGTATTATGG 20 



(2) INFORMATION FOR SEQ ID NO: 189: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 21 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: downstream amplification primer for SEQ ID39, SEQ ID89 

(B) LOCATION: 1..21 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 189: 



CCCCAGCTCC CAAATAATGA C 21 



(2) INFORMATION FOR SEQ ID NO: 190: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: downstream amplification primer for SEQ ID40, SEQ ID90 

(B) LOCATION: 1..20 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 190: 

TCCACAACAG ACACTTAAAC 20 



(2) INFORMATION FOR SEQ ID NO: 191: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 18 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 



(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: downstream amplification primer for SEQ ID41, SEQ ID91 

(B) LOCATION: 1..18 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 191: 

TCTCTTTCCC CATCTCTC 18 



(2) INFORMATION FOR SEQ ID NO: 192: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 19 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: downstream amplification primer for SEQ ID42, SEQ ID92 

(B) LOCATION: 1..19 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 192: 



TCCCCTTCTA TTGTCTACC 19 



(2) INFORMATION FOR SEQ ID NO: 193: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 18 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: downstream amplification primer for SEQ ID43, SEQ ID93 

(B) LOCATION: 1..18 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 193: 

GGTTTGTGTT CAGTACGG 18 



(2) INFORMATION FOR SEQ ID NO: 194: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 21 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 



(ix) FEATURE: 

(A) NAME/KEY: downstream amplification primer for SEQ ID44, SEQ ID94 

(B) LOCATION: 1..21 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 194: 



TGTATATGCC TGGTGGAAAT G 21 



(2) INFORMATION FOR SEQ ID NO: 195: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 21 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 



(ix) FEATURE: 

(A) NAME/KEY: downstream amplification primer for SEQ ID45, SEQ ID95 

(B) LOCATION: 1..21 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 195: 



GTGAAAGAAA CTTGATAGAG G 21 



(2) INFORMATION FOR SEQ ID NO: 196: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 18 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 



(ix) FEATURE: 

(A) NAME/KEY: downstream amplification primer for SEQ ID46, SEQ ID96 

(B) LOCATION: 1..18 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 196: 



CCTCCAACAG TAAGAATC 18 



(2) INFORMATION FOR SEQ ID NO: 197: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 



(ix) FEATURE: 

(A) NAME/KEY: downstream amplification primer for SEQ ID47, SEQ ID97 

(B) LOCATION: 1..20 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 197: 

CAGAACCATT AACTATTCAC 20 



(2) INFORMATION FOR SEQ ID NO: 198: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: downstream amplification primer for SEQ ID48, SEQ ID98 

(B) LOCATION: 1..20 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 198: 



GCCATTTGGA ATTTTGATAG 20 



(2) INFORMATION FOR SEQ ID NO: 199: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 19 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 



(ix) FEATURE: 

(A) NAME/KEY: downstream amplification primer for SEQ ID49, SEQ ID99 

(B) LOCATION: 1..19 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 199: 

TGCAGCATCC CTGGAAGTC 19 



(2) INFORMATION FOR SEQ ID NO: 200: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 21 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: downstream amplification primer for SEQ ID50, SEQ ID100 

(B) LOCATION: 1..21 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 200: 

GAGACATCAT ATCTGTGTTT G 21 



(2) INFORMATION FOR SEQ ID NO: 201: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 19 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: microsequencing oligo 99-2103-270.mis1 

(B) LOCATION: 1..19 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 201: 
GATTCATATG AGACAGCTA 19 



(2) INFORMATION FOR SEQ ID NO: 202: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 23 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: potential microsequencing oligo 99-2228-301.mis1 

(B) LOCATION: 1..23 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 202: 

CCCTGCTTAT CCCTGTAAGG TGG 23 



(2) INFORMATION FOR SEQ ID NO: 203: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 23 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: potential microsequencing oligo 99-2229-240.mis1 

(B) LOCATION: 1..23 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 203: 
TCGTCATCGT GGCCTGGGCT ACA 23 



(2) INFORMATION FOR SEQ ID NO: 204: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 19 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: microsequencing oligo 99-2240-281.mis1 

(B) LOCATION: 1..19 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 204: 
TCTTAATAAC TTTTTATTT 19 



(2) INFORMATION FOR SEQ ID NO: 205: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 19 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: microsequencing oligo 99-2242-206.mis1 

(B) LOCATION: 1..19 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 205: 



TTTCTTTTAG TCAAATTAT 19 



(2) INFORMATION FOR SEQ ID NO: 206: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 23 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: potential microsequencing oligo 99-2244-83.mis1 

(B) LOCATION: 1..23 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 206: 



TAATTGTAGA TACTAAGACC ATT 23 



(2) INFORMATION FOR SEQ ID NO: 207: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 23 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 



(ix) FEATURE: 

(A) NAME/KEY: potential microsequencing oligo 99-2246-340.mis1 

(B) LOCATION: 1..23 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 207: 



ATTTATATGT TAAATGCAGA GAA 23 



(2) INFORMATION FOR SEQ ID NO: 208: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 19 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: microsequencing oligo 99-2248-76.mis1 

(B) LOCATION: 1..19 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 208: 



GAGAGGGAAG GTAATCTTC 19 



(2) INFORMATION FOR SEQ ID NO: 209: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 23 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: microsequencing oligo 99-2250-236.mis1 

(B) LOCATION: 1..23 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 209: 



TTTTATCCAA AACAGAATTA ACA 23 



(2) INFORMATION FOR SEQ ID NO: 210: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 23 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: potential microsequencing oligo 99-2251-151.mis1 

(B) LOCATION: 1..23 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 210: 



TGAAAAGAAG TTCAGACGAT TGC 23 



(2) INFORMATION FOR SEQ ID NO: 211: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 23 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: potential microsequencing oligo 99-2269-179.mis1 

(B) LOCATION: 1..23 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 211: 



AAAATAAAGA AATTCCTAGA GAC 23 



(2) INFORMATION FOR SEQ ID NO: 212: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 23 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: potential microsequencing oligo 99-2271-103.mis1 

(B) LOCATION: 1..23 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 212: 



AGGCATTTAT TTCATATTTA TTA 23 



(2) INFORMATION FOR SEQ ID NO: 213: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 23 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: potential microsequencing oligo 99-2272-109.mis1 

(B) LOCATION: 1..23 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 213: 



AAAAGCACTG CAATTATTTT GGA 23 



(2) INFORMATION FOR SEQ ID NO: 214: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 19 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 
(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: microsequencing oligo 99-2273-528.mis1 

(B) LOCATION: 1..19 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 214: 
GAAGATAAGA AAATCAAGG 19 

(2) INFORMATION FOR SEQ ID NO: 215: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 19 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: microsequencing oligo 99-2275-466.mis1 

(B) LOCATION: 1..19 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 215: 
TGATAGCATT AAATACTCC 19 



(2) INFORMATION FOR SEQ ID NO: 216: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 23 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: potential microsequencing oligo 99-2278-276.mis1 

(B) LOCATION: 1..23 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 216: 



GAAAAAAATG GGAACATCTT CAC 23 



(2) INFORMATION FOR SEQ ID NO: 217: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 23 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: microsequencing oligo 99-2312-358.mis1 

(B) LOCATION: 1..23 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 217: 



TTTTAGAGAG AGATGGAAAA AAA 23 



(2) INFORMATION FOR SEQ ID NO: 218: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 23 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: potential microsequencing oligo 99-2315-213.mis1 

(B) LOCATION: 1..23 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 218: 



AGATGGATTC TACCCACAGG CAA 23 



(2) INFORMATION FOR SEQ ID NO: 219: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 19 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 



(vi) ORIGINAL SOURCE: 






(A) ORGANISM: Homo sapiens 
(ix) FEATURE: 

(A) NAME/KEY: microsequencing oligo 99-2320-292.mis1 

(B) LOCATION: 1..19 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 219: 



TCATTCACTA AACTTCAAC 19 



(2) INFORMATION FOR SEQ ID NO: 220: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 19 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: microsequencing oligo 99-2321-82.mis1 

(B) LOCATION: 1..19 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 220: 



GCTTACTGAG TGTCCACTC 19 



(2) INFORMATION FOR SEQ ID NO: 221: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 19 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: microsequencing oligo 99-2324-338.mis1 

(B) LOCATION: 1..19 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 221: 

AGAAGACAAA ATCGCAGGA 19 



(2) INFORMATION FOR SEQ ID NO: 222: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 23 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: potential microsequencing oligo 99-2333-423.mis1 

(B) LOCATION: 1..23 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 222: 
GAGACGCTAT CTATGCAAGG AGG 23 



(2) INFORMATION FOR SEQ ID NO: 223: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 23 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: microsequencing oligo 99-2341-485.mis1 

(B) LOCATION: 1..23 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 223: 
TTTTATCTGT CTGTTACCTA CAC 23 



(2) INFORMATION FOR SEQ ID NO: 224: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 23 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: microsequencing oligo 99-23^2-217.mis1 

(B) LOCATION: 1..23 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 224: 



TTTTGCCTTG GACTTTCATG ACA 23 



(2) INFORMATION FOR SEQ ID NO: 225: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 23 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: potential microsequencing oligo 99-2362-270.mis1 

(B) LOCATION: 1..23 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 225: 



TCTCTCTTGG GTGGTTCCTC AAC 23 



(2) INFORMATION FOR SEQ ID NO: 226: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 19 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 



(ix) FEATURE: 

(A) NAME/KEY: microsequencing oligo 99-2 301- 329.mis1 

(B) LOCATION: 1..19 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 226: 



AAAATGATGA ACCATATAC 19 



(2) INFORMATION FOR SEQ ID NO: 227: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 23 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: potential microsequencing oligo 99-23b7-G I.mis1 

(B) LOCATION: 1..23 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 227: 



TA.-V^CATTTC ATTATTTCAG AAA 



(2) INFORMATION FOR SEQ ID NO: 228: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 23 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: microsequencing oligo 99-2371-93.mis1 

(B) LOCATION: 1..23 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 228: 



TTTTAAACTT TCCTAATACT TAG 23 



(2) INFORMATION FOR SEQ ID NO: 229: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 23 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: potential microsequencing oligo 99-2379-200.mis1 

(B) LOCATION: 1..23 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 229: 



i;AGA/ACTTCC TGTTGAACCT GTT 2 J 



(2) INFORMATION FOR SEQ ID NO: 230: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 23 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: potential microsequencing oligo 99-2381-391.mis1 

(B) LOCATION: 1..23 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 230: 



AGTGGTCTTC AGGTTATTGG TAG 23 



(2) INFORMATION FOR SEQ ID NO: 231: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 23 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 






(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: potential microsequencing oligo 1 1 i(iH .mis1 

(B) LOCATION: 1..23 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 231: 



ATTTTAAflACJ CAAAACTTAA TOO 2 i 



(2) INFORMATION FOR SEQ ID NO: 232: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 19 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE:

(A) NAME/KEY: microsequencing oligo 90-::-; : I"! . mi.,; I

(B) LOCATION: 1..19

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 232:



GATCAAGCAG TGCCCACTC 



(2) INFORMATION FOR SEQ ID NO: 233: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 23 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: potential microsequencing oligo 99-2559-253.mis

(B) LOCATION: 1..23 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 233:
CAGGTGTTTT CATCCCCTCT TAG 23

(2) INFORMATION FOR SEQ ID NO: 234:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 23 base pairs

(B) TYPE: NUCLEIC ACID

(C) STRANDEDNESS: SINGLE

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens

(ix) FEATURE:

(A) NAME/KEY: potential microsequencing oligo '.jy-iciijGb- 1 12 . :ni:j i

(B) LOCATION: 1..23

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 234:
GCCTTCACAA CCGCAGAGGC AAG 23



(2) INFORMATION FOR SEQ ID NO: 235:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 23 base pairs

(B) TYPE: NUCLEIC ACID
(C) STRANDEDNESS: SINGLE
(D) TOPOLOGY: LINEAR

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: potential microsequencing oligo 99-2567-329.mis1

(B) LOCATION: 1..23 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 235: 
CACTGTCAGA TATGAAATGA TGC 23 



(2) INFORMATION FOR SEQ ID NO: 236: 
(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 19 base pairs

(B) TYPE: NUCLEIC ACID

(C) STRANDEDNESS: SINGLE
(D) TOPOLOGY: LINEAR

(ii) MOLECULE TYPE: DNA

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens

(ix) FEATURE:

(A) NAME/KEY: microsequencing oligo M'»-xii7()-:i 1 U . mii; I
(B) LOCATION:

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 236:



ACTTCCAAAT TATGAGAAG 19



(2) INFORMATION FOR SEQ ID NO: 237:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 23 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 
(D) TOPOLOGY: LINEAR

(ii) MOLECULE TYPE: DNA

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens 

(ix) FEATURE:

(A) NAME/KEY: potential microsequencing oligo 99-2571-242.mis1

(B) LOCATION: 1..23

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 237:



ATAATGAATG AGTATTTGAT ATT 23 



(2) INFORMATION FOR SEQ ID NO: 238:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 23 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 



(vi) ORIGINAL SOURCE: 
(A) ORGANISM: Homo sapiens
(ix) FEATURE:

(A) NAME/KEY: microsequencing oligo ^0-20 10- K? 1 . iniyl

(B) LOCATION: 1..23

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 238:



TTTTCCCTTC CCTACGTATG GCT 23



(2) INFORMATION FOR SEQ ID NO: 239:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 23 base pairs

(B) TYPE: NUCLEIC ACID

(C) STRANDEDNESS: SINGLE

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: microsequencing oligo 99-2ul S-B3 .inij 1

(B) LOCATION: 1..23

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 239:



TTTTAATCAC AAGTTGGCAA AAG 



(2) INFORMATION FOR SEQ ID NO: 240:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 23 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: potential microsequencing oligo 99-2620-221.mis1

(B) LOCATION: 1..23 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 240: 
TTGACTGGGC TCCTGATGTG TCC 23

(2) INFORMATION FOR SEQ ID NO: 241:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 23 base pairs
(B) TYPE: NUCLEIC ACID

(C) STRANDEDNESS: SINGLE

(D) TOPOLOGY: LINEAR

(ii) MOLECULE TYPE: DNA

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE:

(A) NAME/KEY: potential microsequencing oligo 99-2624-401.mis1

(B) LOCATION: 1..23

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 241:
ATCTGGCCAT AGGCAGAACA TTG 23

(2) INFORMATION FOR SEQ ID NO: 242: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 23 base pairs 

(B) TYPE: NUCLEIC ACID

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: potential microsequencing oligo 99-2625-70.mis1

(B) LOCATION: 1..23 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 242: 
AGTGACTCAA CCAGAAAGAG AGC 23 

(2) INFORMATION FOR SEQ ID NO: 243: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 23 base pairs 

(B) TYPE: NUCLEIC ACID
(C) STRANDEDNESS: SINGLE
(D) TOPOLOGY: LINEAR

(ii) MOLECULE TYPE: DNA

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens

(ix) FEATURE:

(A) NAME/KEY: potential microsequencing oligo jO-{i7 , m i:; 1

(B) LOCATION: 1..23

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 243:
TAAATTCTCC CTAGAAGATT AAG 23

(2) INFORMATION FOR SEQ ID NO: 244:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 23 base pairs
(B) TYPE: NUCLEIC ACID

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens

(ix) FEATURE:

(A) NAME/KEY: microsequencing oligo 0'»-2to 1- 1 20 . min 1
(B) LOCATION: 1..23

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 244:
TTTTTATTTC TTCCCCTAGG CAA 23



(2) INFORMATION FOR SEQ ID NO: 245: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 23 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 
(A) NAME/KEY: potential microsequencing oligo 99-263-;-3'l 1 .mis 1
(B) LOCATION: 1..23

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 245:



(K^AATCAATA TTTATTTATT ATC ^ 



(2) INFORMATION FOR SEQ ID NO: 246:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 23 base pairs

(B) TYPE: NUCLEIC ACID

(C) STRANDEDNESS: SINGLE
(D) TOPOLOGY: LINEAR

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE:

(A) NAME/KEY: potential microsequencing oligo ^)9-2637 -:iy . mii; I

(B) LOCATION: 1..23

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 246:



CCATCACTTC CTCCTAGTGA AAA 23 



(2) INFORMATION FOR SEQ ID NO: 247:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 23 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: potential microsequencing oligo 99-2612-255.mis1

(B) LOCATION: 1..23 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 247: 



TGAGGGTGTT TCCAGAAGAG ACT 23

(2) INFORMATION FOR SEQ ID NO: 248:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 23 base pairs

(B) TYPE: NUCLEIC ACID

(C) STRANDEDNESS: SINGLE

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens

(ix) FEATURE:

(A) NAME/KEY: potential microsequencing oligo ^^-liG'l 0- 1 1 U , mi:: 1
(B) LOCATION: 1..23

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 248:



CACAAATTAA TTCCATTGTT ATA 23



(2) INFORMATION FOR SEQ ID NO: 249: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 23 base pairs 

(B) TYPE: NUCLEIC ACID

(C) STRANDEDNESS: SINGLE

(D) TOPOLOGY: LINEAR

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens

(ix) FEATURE: 

(A) NAME/KEY: potential microsequencing oligo 99-2647-368.mis1

(B) LOCATION: 1..23

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 249: 



TTAAGGCCTT CAACTGATTA GAC 23



(2) INFORMATION FOR SEQ ID NO: 250: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 19 base pairs 
(B) TYPE: NUCLEIC ACID

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 
(ii) MOLECULE TYPE: DNA

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens

(ix) FEATURE:

(A) NAME/KEY: microsequencing oligo n<)-:-:(,.l ')- U)7 .mi:; I
(B) LOCATION:

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 250:



ACTCTt'.ClAcX* CTTTTATCA I 



(2) INFORMATION FOR SEQ ID NO: 251:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 23 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens

(ix) FEATURE:

(A) NAME/KEY: potential microsequencing oligo 'i<i-2 1 0 "1-270 . ini:i2
(B) LOCATION: 1..23

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 251:



AGTGTAGAAA AATTGAAGGT CTG 23



(2) INFORMATION FOR SEQ ID NO: 252: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 19 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens

(ix) FEATURE:

(A) NAME/KEY: microsequencing oligo 99-2228-301.mis2

(B) LOCATION: 1..19 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 252:



OGCCTTGCCC ATATGGGTC r> 



(2) INFORMATION FOR SEQ ID NO: 253:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 19 base pairs

(B) TYPE: NUCLEIC ACID

(C) STRANDEDNESS: SINGLE
(D) TOPOLOGY: LINEAR

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: microsequencing oligo 9^-22:0-2'! 0 . mi:;2 

(B) LOCATION: 1..19 

(XL) SEQUENCE DESCRIPTION: SEQ ID NO: 253: 
AAGGACTGGA ACAGGTAGT 19



(2) INFORMATION FOR SEQ ID NO: 254:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 23 base pairs
(B) TYPE: NUCLEIC ACID

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: potential microsequencing oligo 99-2240-281.mis2
(B) LOCATION: 1..23

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 254: 
AGAAAAAAAA GATTCGAATT ACT 23 



(2) INFORMATION FOR SEQ ID NO: 255: 
(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 23 base pairs
(B) TYPE: NUCLEIC ACID
(C) STRANDEDNESS: SINGLE
(D) TOPOLOGY: LINEAR

(ii) MOLECULE TYPE: DNA

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens

(ix) FEATURE:

(A) NAME/KEY: potential microsequencing oligo 'tt»-:>2'l :?-X0() . ini ::2
(B) LOCATION: 1..23

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 255:
CTTAAGAAAA AAGTAAAATA TAA 23



(2) INFORMATION FOR SEQ ID NO: 256:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 19 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE:

(A) NAME/KEY: microsequencing oligo 99-22'l'l -0 3 -mis2

(B) LOCATION: 1..19 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 256: 
TACCTACATG GTTTAAGCA 19 



(2) INFORMATION FOR SEQ ID NO: 257:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 19 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 



(vi) ORIGINAL SOURCE: 
(A) ORGANISM: Homo sapiens
(ix) FEATURE:

(A) NAME/KEY: microsequencing oligo 90-22'l 0-34 0 . ini;0
(B) LOCATION:

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 257:



'I't;rAAAACTT ATTTTTCTT 1*) 



(2) INFORMATION FOR SEQ ID NO: 258:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 23 base pairs
(B) TYPE: NUCLEIC ACID
(C) STRANDEDNESS: SINGLE
(D) TOPOLOGY: LINEAR

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens

(ix) FEATURE:

(A) NAME/KEY: potential microsequencing oligo *.)9-22'HJ-7G . nus2

(B) LOCATION: 1..23

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 258:
CCAGGGGATG GGCAGACTTC AGG 23



(2) INFORMATION FOR SEQ ID NO: 259:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 23 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: potential microsequencing oligo 99-2250-236.mis2

(B) LOCATION: 1..23 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 259: 
AATAAAAATA AAAAACCCAA AGT



(2) INFORMATION FOR SEQ ID NO: 260:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 23 base pairs

(B) TYPE: NUCLEIC ACID

(C) STRANDEDNESS: SINGLE

(D) TOPOLOGY: LINEAR

(ii) MOLECULE TYPE: DNA

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens

(ix) FEATURE: 

(A) NAME/KEY: microsequencing oligo 'J'J-lj!21j I - 1 13 1 . inii:li
(B) LOCATION: 1..23

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 260:
TTTTACACCC AAACTAGTCT ATC 23



(2) INFORMATION FOR SEQ ID NO: 261:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 19 base pairs
(B) TYPE: NUCLEIC ACID

(C) STRANDEDNESS: SINGLE

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: microsequencing oligo 99-2269-179.mis2

(B) LOCATION: 1..19

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 261:
TTGATCTTGA TAGGCTGTA 19



(2) INFORMATION FOR SEQ ID NO: 262: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 19 base pairs 

(B) TYPE: NUCLEIC ACID 
(C) STRANDEDNESS: SINGLE
(D) TOPOLOGY: LINEAR

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens

(ix) FEATURE:

(A) NAME/KEY: microsequencing oligo ::*/ 1 0 J , mi

(B) LOCATION:

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 262:



GAAGATAAGA AAATCAAGG 19



(2) INFORMATION FOR SEQ ID NO: 263:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 23 base pairs 

(B) TYPE: NUCLEIC ACID

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE:

(A) NAME/KEY: microsequencing oligo 99-2272-100.mis2
(B) LOCATION: 1..23

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 263:



TTTTACTTGC AATATTTCAC ACT 23



(2) INFORMATION FOR SEQ ID NO: 264: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 23 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 



(ix) FEATURE: 
(A) NAME/KEY: potential microsequencing oligo 99-2273-528.mis2
(B) LOCATION: 1..23

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 264:



a(;(;;c:atttat ttcatattta tta 



(2) INFORMATION FOR SEQ ID NO: 265:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 23 base pairs
(B) TYPE: NUCLEIC ACID

(C) STRANDEDNESS: SINGLE

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens

(ix) FEATURE:

(A) NAME/KEY: potential microsequencing oligo 99-227 5-^ GG . :nii;2
(B) LOCATION: 1..23

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 265:



TAC.TATCCCT ATTCACAGTT TTT 



(2) INFORMATION FOR SEQ ID NO: 266:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 19 base pairs
(B) TYPE: NUCLEIC ACID
(C) STRANDEDNESS: SINGLE
(D) TOPOLOGY: LINEAR

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: microsequencing oligo 99-2273-276.mis2

(B) LOCATION: 1..19 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 266: 



TTGTTGGAGA TGCACAGGC 19

(2) INFORMATION FOR SEQ ID NO: 267:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 23 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE

(D) TOPOLOGY: LINEAR

(ii) MOLECULE TYPE: DNA

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens

(ix) FEATURE:

(A) NAME/KEY: potential microsequencing oligo '-V»-2 312-:i5U .mi:^2

(B) LOCATION: 1..23

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 267:



tiCCCATTTAC CCAGAAGGCC TAC 2 3 



(2) INFORMATION FOR SEQ ID NO: 268:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 23 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE:

(A) NAME/KEY: microsequencing oligo 99-2315-213.mis2

(B) LOCATION: 1..23 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 268: 



TTTTTTTTAA AATAAGGTTT TCT 23 



(2) INFORMATION FOR SEQ ID NO: 269: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 23 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 
(ii) MOLECULE TYPE: DNA

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens

(ix) FEATURE:

(A) NAME/KEY: potential microsequencing oligo on-2.320-;:!»X . mi

(B) LOCATION: 1..23

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 269:



AAATTCATTA AATTTATAAA AAC 23



(2) INFORMATION FOR SEQ ID NO: 270:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 23 base pairs 
(B) TYPE: NUCLEIC ACID
(C) STRANDEDNESS: SINGLE
(D) TOPOLOGY: LINEAR

(ii) MOLECULE TYPE: DNA

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE:

(A) NAME/KEY: potential microsequencing oligo 40-232 1-U2 .mii2

(B) LOCATION: 1..23

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 270:



AGGAAATATT TGAGTAGGTA TCC 



(2) INFORMATION FOR SEQ ID NO: 271: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 23 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: potential microsequencing oligo 99-2324-338.mis

(B) LOCATION: 1..23 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 271:
TTTTTACTGT TGAGGGATTT CTT 23

(2) INFORMATION FOR SEQ ID NO: 272:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 23 base pairs
(B) TYPE: NUCLEIC ACID
(C) STRANDEDNESS: SINGLE
(D) TOPOLOGY: LINEAR

(ii) MOLECULE TYPE: DNA

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE:

(A) NAME/KEY: microsequencing oligo 99-233:i-'l2 3 .iniiil^

(B) LOCATION: 1..23

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 272:
TTTTCCTGTC CAAATGTTGA ACA 23

(2) INFORMATION FOR SEQ ID NO: 273: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 23 base pairs 

(B) TYPE: NUCLEIC ACID

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: potential microsequencing oligo 99-2341-485.mis2
(B) LOCATION: 1..23

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 273:
AGAGCCTGTG CGATTCTTTG TAA 23



(2) INFORMATION FOR SEQ ID NO: 274: 






(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 23 base pairs

(B) TYPE: NUCLEIC ACID

(C) STRANDEDNESS: SINGLE
(D) TOPOLOGY: LINEAR

(ii) MOLECULE TYPE: DNA

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens

(ix) FEATURE:

(A) NAME/KEY: potential microsequencing oligo '»*)-2 J l:^-:a7 .ini:j2

(B) LOCATION: 1..23

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 274:



CACAATCTGG CCTGTTTCTA GAA 23



(2) INFORMATION FOR SEQ ID NO: 275: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 23 base pairs

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: microsequencing oligo 99-2362-270.mis2

(B) LOCATION: 1..23

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 275:



TTTTACTTGG TCAAGGTCAC ACA 23 



(2) INFORMATION FOR SEQ ID NO: 276: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 23 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 



(vi) ORIGINAL SOURCE: 
(A) ORGANISM: Homo sapiens
(ix) FEATURE:

(A) NAME/KEY: potential microsequencing oligo 99-23(M -329 . n\i.'32

(B) LOCATION: 1..23

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 276:



CAATTATATC 1'TACCTTGCC TCA 2..i 



(2) INFORMATION FOR SEQ ID NO: 277:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 23 base pairs

(B) TYPE: NUCLEIC ACID

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: microsequencing oligo 94-23G7-G1 .mii;2
(B) LOCATION: 1..23

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 277:



TTTTTTGGTG AAJXATGCATA TTA 2 3 



(2) INFORMATION FOR SEQ ID NO: 278: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 23 base pairs

(B) TYPE: NUCLEIC ACID

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: potential microsequencing oligo 99-2371-93.mis2

(B) LOCATION: 1..23

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 278: 
ATTATGTAAA AAGTAGGCAG TGA 23



(2) INFORMATION FOR SEQ ID NO: 279:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 19 base pairs
(B) TYPE: NUCLEIC ACID
(C) STRANDEDNESS: SINGLE
(D) TOPOLOGY: LINEAR

(ii) MOLECULE TYPE: DNA

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens 

(ix) FEATURE:

(A) NAME/KEY: microsequencing oligo 99-2379-200.mis2
(B) LOCATION: 1..19

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 279:
GGACCACAGG ACAGTTCTA 19



(2) INFORMATION FOR SEQ ID NO: 280: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 19 base pairs
(B) TYPE: NUCLEIC ACID
(C) STRANDEDNESS: SINGLE
(D) TOPOLOGY: LINEAR

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: microsequencing oligo 99-2381-394.mis2

(B) LOCATION: 1..19 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 280: 
TTTAGCTCCC CTACTTTTC 



(2) INFORMATION FOR SEQ ID NO: 281: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 19 base pairs 

(B) TYPE: NUCLEIC ACID 
(C) STRANDEDNESS: SINGLE
(D) TOPOLOGY: LINEAR

(ii) MOLECULE TYPE: DNA

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens

(ix) FEATURE:

(A) NAME/KEY: microsequencing oligo MM-:m It- MiH . tn i:;;:
(B) LOCATION: 1..19

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 281:
AAATATTATG TACAATTCT 19



(2) INFORMATION FOR SEQ ID NO: 282:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 23 base pairs

(B) TYPE: NUCLEIC ACID

(C) STRANDEDNESS: SINGLE

(D) TOPOLOGY: LINEAR

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens

(ix) FEATURE:

(A) NAME/KEY: potential microsequencing oligo 09-211 ')-:^U 5 . inis2

(B) LOCATION: 1..23

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 282:



AGTCACAGCT CCCTGGAGGG TGG 23



(2) INFORMATION FOR SEQ ID NO: 283: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 19 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 



(ix) FEATURE: 
(A) NAME/KEY: microsequencing oligo 99-2559-253.mis2
(B) LOCATION: 1..19

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 283:



t-lA'rGGATGAT CTGACACAC I '» 



(2) INFORMATION FOR SEQ ID NO: 284:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 19 base pairs
(B) TYPE: NUCLEIC ACID
(C) STRANDEDNESS: SINGLE
(D) TOPOLOGY: LINEAR

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE:

(A) NAME/KEY: microsequencing oligo 99-2 50u- 1 1 2 . mis2

(B) LOCATION: 1..19

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 284:



Ac,;t;f.;TGGC(:A agctccttc i"^ 



(2) INFORMATION FOR SEQ ID NO: 285:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 19 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: microsequencing oligo 99-2567-329.mis2

(B) LOCATION: 1..19 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 285: 



TATAGCCCAA AGAAAGCCA 19

(2) INFORMATION FOR SEQ ID NO: 286:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 23 base pairs
(B) TYPE: NUCLEIC ACID
(C) STRANDEDNESS: SINGLE
(D) TOPOLOGY: LINEAR

(ii) MOLECULE TYPE: DNA

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens

(ix) FEATURE:

(A) NAME/KEY: potential microsequencing oligo OO-;!07()-2 1 0 . mis^

(B) LOCATION: 1..23

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 286: 



AACTTAGCCA CTTCAGAGGC CTC 23



(2) INFORMATION FOR SEQ ID NO: 287: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 19 base pairs

(B) TYPE: NUCLEIC ACID

(C) STRANDEDNESS: SINGLE

(D) TOPOLOGY: LINEAR

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE:

(A) NAME/KEY: microsequencing oligo 99-2571-242.mis2

(B) LOCATION: 1..19 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 287: 



GCTGACACAT TTAATTATA 19 



(2) INFORMATION FOR SEQ ID NO: 288:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 23 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 
(ii) MOLECULE TYPE: DNA

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE:

(A) NAME/KEY: potential microsequencing oligo ')n-X(i ) 0- 1 ;M , ini.-:;:
(B) LOCATION: 1..23

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 288:



CiACAA'PTTTC TAAGTGCACC ATA 2 A 



(2) INFORMATION FOR SEQ ID NO: 289:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 23 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE:

(A) NAME/KEY: potential microsequencing oligo C>-H3 .mi j2

(B) LOCATION: 1..23

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 289:



TTCCCAGAAG ATGAGAATTT GCT 



(2) INFORMATION FOR SEQ ID NO: 290: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 23 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: microsequencing oligo 99-2620-227 . mis2 

(B) LOCATION: 1..23 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 290:



TTTTAACACC CAGCAACATA CCC 23



(2) INFORMATION FOR SEQ ID NO: 291:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 23 base pairs

(B) TYPE: NUCLEIC ACID

(C) STRANDEDNESS: SINGLE

(D) TOPOLOGY: LINEAR

(ii) MOLECULE TYPE: DNA

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens 

(ix) FEATURE:

(A) NAME/KEY: microsequencing oligo 99-2624-407.mis2
(B) LOCATION: 1..23

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 291: 



TTTTCTCTTT CCCCATCTCT CCC 



(2) INFORMATION FOR SEQ ID NO: 292:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 23 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: microsequencing oligo 99-2625-70.mis2

(B) LOCATION: 1..23 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 292: 



TTTTCTCTCT TCKTCCTCTC TCC 



(2) INFORMATION FOR SEQ ID NO: 293:
(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 23 base pairs

(B) TYPE: NUCLEIC ACID

(C) STRANDEDNESS: SINGLE

(D) TOPOLOGY: LINEAR

(ii) MOLECULE TYPE: DNA

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens

(ix) FEATURE:

(A) NAME/KEY: microsequencing oligo ^)')-2 0.u)-(i7 . mi:.::^
(B) LOCATION: 1..23

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 293:



TTTTACTCCC TGTTCTGGAC CAA 23

(2) INFORMATION FOR SEQ ID NO: 294: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 23 base pairs
(B) TYPE: NUCLEIC ACID

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: potential microsequencing oligo 99-2633-129.mis2

(B) LOCATION: 1..23

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 294:
TCAAGGGTTC TCTCATTGTC TAC 23 



(2) INFORMATION FOR SEQ ID NO: 295: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 23 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 



(vi) ORIGINAL SOURCE: 
(A) ORGANISM: Homo sapiens
(ix) FEATURE:

(A) NAME/KEY: microsequencing oligo 99-2 o 3 1 - 31 I . mi ? ."^

(B) LOCATION: 1..23

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 295:



TTTTTAAATA ATCTCTCACC TGT 23



(2) INFORMATION FOR SEQ ID NO: 296:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 23 base pairs

(B) TYPE: NUCLEIC ACID

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens

(ix) FEATURE:

(A) NAME/KEY: microsequencing oligo 9^-2 C37-2U . mi j J
(B) LOCATION: 1..23

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 296:



TTTTAAAACC CACCCTCCTT TGA 23



(2) INFORMATION FOR SEQ ID NO: 297: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 19 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: microsequencing oligo 99-2642-255.mis2

(B) LOCATION: 1..19

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 297: 
TCACTTCAGA TTCAAATGC 19

(2) INFORMATION FOR SEQ ID NO: 298:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 23 base pairs
(B) TYPE: NUCLEIC ACID

(C) STRANDEDNESS: SINGLE

(D) TOPOLOGY: LINEAR

(ii) MOLECULE TYPE: DNA

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: microsequencing oligo 99-:iG; 5-1 18 .mis2

(B) LOCATION: 1..23

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 298:
TTTTCAGATT CTTCATTGCT AGC 23



(2) INFORMATION FOR SEQ ID NO: 299: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 19 base pairs

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 
(D) TOPOLOGY: LINEAR

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: microsequencing oligo 99-2647-368.mis2

(B) LOCATION: 1..19 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 299: 
AGATAATGTG AGTGGGCCT 



(2) INFORMATION FOR SEQ ID NO: 300: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 23 base pairs 

(B) TYPE: NUCLEIC ACID 
(C) STRANDEDNESS: SINGLE
(D) TOPOLOGY: LINEAR

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens

(ix) FEATURE:

(A) NAME/KEY: potential microsequencing oligo tM m- 1 07 . m i ::X

(B) LOCATION: 1..23

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 300:



AGTTTCAGTG CATTGCTGTC CTG 23



(2) INFORMATION FOR SEQ ID NO: 301:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 47 base pairs 

(B) TYPE: NUCLEIC ACID

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi, ) ORTGIMAL SOURCE: 

(A) ORGANISM: Homo sapiens 

ti:-:) FEATURE: 

(A) NAME/KEY: polymorphic fragment 99-3'14 
(D) LOCATION: 1..4 7 

(ix) FEATURE: 

(A) NAME/KEY: polymorphic base 

(D) LOCATION: 2 4 

(D) OTHER INFORMATION: base a 

(ix) FEATURE: 

(A) NAME/KEY: Potential microsequencing oligo 99-344-misl 
(BV LOCATION: 1. .23 

(ix) FEATURE: 

(A) NAME/KEY: microsequencing oligo 99-344-mis2 

(B) LOCATION: complement 25..43

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 301: 



TGCTGCCAAG GATCCATGTC AGCATGCTCC TCTCTGAGCC CTGGTCT 



(2) INFORMATION FOR SEQ ID NO: 302:
(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 47 base pairs
(B) TYPE: NUCLEIC ACID

(C) STRANDEDNESS: SINGLE

(D) TOPOLOGY: LINEAR

(ii) MOLECULE TYPE: DNA

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens 

(ix) FEATURE:

(A) NAME/KEY: polymorphic fragment 99-366

(B) LOCATION: 1..47

(ix) FEATURE:

(A) NAME/KEY: polymorphic base

(B) LOCATION: 24

(D) OTHER INFORMATION: base t

(ix) FEATURE:

(A) NAME/KEY: microsequencing oligo 99-366-mis1
(B) LOCATION: 5..23

(ix) FEATURE:

(A) NAME/KEY: Potential microsequencing oligo 99-366-mis2
(B) LOCATION: complement 25..47

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 302:
AGCGCCTGGC TTCAGGGACA GCTTAGGAAA TGTTTGTTGA GTTAGTG



(2) INFORMATION FOR SEQ ID NO: 303:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 47 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens

(ix) FEATURE: 

(A) NAME/KEY: polymorphic fragment 99-359 

(B) LOCATION: 1..47

(ix) FEATURE: 

(A) NAME/KEY: polymorphic base 

(B) LOCATION: 24 

(D) OTHER INFORMATION: base g
(ix) FEATURE:

(A) NAME/KEY: Potential microsequencing oligo 99-359-mis1

(B) LOCATION: 1..23

(ix) FEATURE:

(A) NAME/KEY: microsequencing oligo 99-359-mis2

(B) LOCATION: complement 25..43

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 303:

CTACACACTC ATCCCCTCCA TCCCGTCTCA ACAAATCCTG GCACCTC 47



(2) INFORMATION FOR SEQ ID NO: 304: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 47 base pairs 

(B) TYPE: NUCLEIC ACID

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE:

(A) NAME/KEY: polymorphic fragment 99-355
(B) LOCATION: 1..47

(ix) FEATURE:

(A) NAME/KEY: polymorphic base 

(B) LOCATION: 24 

(D) OTHER INFORMATION: base g 

(ix) FEATURE: 

(A) NAME/KEY: Potential microsequencing oligo 99-355-mis1

(B) LOCATION: 1..23 

(ix) FEATURE: 

(A) NAME/KEY: microsequencing oligo 99-355-mis2 

(B) LOCATION: complement 25..43

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 304:
GGAGTTTCGG GGAGTTTCGG GAGGGTTCCT GGGAAGAAGC TCCTCCC 47

(2) INFORMATION FOR SEQ ID NO: 305: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 48 base pairs
(B) TYPE: NUCLEIC ACID
(C) STRANDEDNESS: SINGLE
(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens

(ix) FEATURE:

(A) NAME/KEY: polymorphic fragment 99-365
(B) LOCATION:

(ix) FEATURE:

(A) NAME/KEY: polymorphic base 

(B) LOCATION: 24 

(D) OTHER INFORMATION: base c

(ix) FEATURE:

(A) NAME/KEY: microsequencing oligo 99-365-mis1

(B) LOCATION: 5..23

(ix) FEATURE:

(A) NAME/KEY: Potential microsequencing oligo 99-365-mis2
(B) LOCATION: complement 25..48

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 305:
CCTACCAAGC AAGCAGCCCC AGCCTAGGGT CAGACAGGGT GAGCCTC 47



(2) INFORMATION FOR SEQ ID NO: 306:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 47 base pairs

(B) TYPE: NUCLEIC ACID

(C) STRANDEDNESS: SINGLE

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: polymorphic fragment 99-2452 

(B) LOCATION: 1..47

(D) OTHER INFORMATION: Extracted from sequence gb:M10065

(3909..3955)

(ix) FEATURE: 

(A) NAME/KEY: polymorphic base 

(B) LOCATION: 24 

(D) OTHER INFORMATION: base c
(ix) FEATURE:

(A) NAME/KEY: microsequencing oligo 99-2452-mis1

(B) LOCATION: 5..23

(ix) FEATURE:

(A) NAME/KEY: Potential microsequencing oligo 99-2452-mis2

(B) LOCATION: complement 25..47

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 306:

TGCGCGCGGA CATGCAGGAC GTGCGCGGCC GCCTGGTGCA GTACCGC 47



(2) INFORMATION FOR SEQ ID NO: 307:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 47 base pairs

(B) TYPE: NUCLEIC ACID

(C) STRANDEDNESS: SINGLE

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens

(ix) FEATURE:

(A) NAME/KEY: polymorphic fragment 99-344
(B) LOCATION: 1..47

(D) OTHER INFORMATION: variant version of SEQ ID301

(ix) FEATURE:

(A) NAME/KEY: polymorphic base 

(B) LOCATION: 24 

(D) OTHER INFORMATION: base g; a in SEQ ID301
(ix) FEATURE:

(A) NAME/KEY: Potential microsequencing oligo 99-344-mis1

(B) LOCATION: 1..23 

(ix) FEATURE: 

(A) NAME/KEY: microsequencing oligo 99-344-mis2 

(B) LOCATION: complement 25..43

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 307: 



TGCTGCCAAG GATCCATGTC AGCGTGCTCC TCTCTGAGCC CTGGTCT 



(2) INFORMATION FOR SEQ ID NO: 308: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 47 base pairs
(B) TYPE: NUCLEIC ACID
(C) STRANDEDNESS: SINGLE
(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE:

(A) NAME/KEY: polymorphic fragment 99-366
(B) LOCATION: 1..47

(D) OTHER INFORMATION: variant version of SEQ ID302

(ix) FEATURE:

(A) NAME/KEY: polymorphic base
(B) LOCATION: 24

(D) OTHER INFORMATION: base c; t in SEQ ID302

(ix) FEATURE:

(A) NAME/KEY: microsequencing oligo 99-366-mis1
(B) LOCATION: 5..23

(ix) FEATURE:

(A) NAME/KEY: Potential microsequencing oligo 99-366-mis2

(B) LOCATION: complement 25..47

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 308:

AGCGCCTGGC TTCAGGGACA GCTCAGGAAA TGTTTGTTGA GTTAGTG 47



(2) INFORMATION FOR SEQ ID NO: 309:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 47 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: polymorphic fragment 99-359 

(B) LOCATION: 1..47

(D) OTHER INFORMATION: variant version of SEQ ID303 

(ix) FEATURE: 

(A) NAME/KEY: polymorphic base 

(B) LOCATION: 24 

(D) OTHER INFORMATION: base a; g in SEQ ID303
(ix) FEATURE:

(A) NAME/KEY: Potential microsequencing oligo 99-359-mis1
(B) LOCATION: 1..23

(ix) FEATURE:

(A) NAME/KEY: microsequencing oligo 99-359-mis2

(B) LOCATION: complement 25..43

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 309:
CTACAGAGTC ATCCCCTCCA TCCAGTCTCA ACAAATCCTG CCACCTC 47



(2) INFORMATION FOR SEQ ID NO: 310:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 47 base pairs
(B) TYPE: NUCLEIC ACID
(C) STRANDEDNESS: SINGLE
(D) TOPOLOGY: LINEAR

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens

(ix) FEATURE:

(A) NAME/KEY: polymorphic fragment 99-355
(B) LOCATION: 1..47

(D) OTHER INFORMATION: variant version of SEQ ID304

(ix) FEATURE:

(A) NAME/KEY: polymorphic base
(B) LOCATION: 24

(D) OTHER INFORMATION: base a; g in SEQ ID304
(ix) FEATURE:

(A) NAME/KEY: Potential microsequencing oligo 99-355-mis1

(B) LOCATION: 1..23 

(ix) FEATURE: 

(A) NAME/KEY: microsequencing oligo 99-355-mis2 

(B) LOCATION: complement 25..43

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 310:
GGAGTTTCGG GGAGTTTCGG GAGAGTTCCT GGGAAGAAGC TCCTCCC 47



(2) INFORMATION FOR SEQ ID NO: 311: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 48 base pairs
(B) TYPE: NUCLEIC ACID

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens

(ix) FEATURE:

(A) NAME/KEY: polymorphic fragment 99-365
(B) LOCATION: 1..48

(D) OTHER INFORMATION: variant version of SEQ ID305

(ix) FEATURE: 

(A) NAME/KEY: polymorphic base 
(B) LOCATION: 24

(D) OTHER INFORMATION: base t; c in SEQ ID305

(ix) FEATURE:

(A) NAME/KEY: microsequencing oligo 99-365-mis1
(B) LOCATION: 5..23

(ix) FEATURE:

(A) NAME/KEY: Potential microsequencing oligo 99-365-mis2

(B) LOCATION: complement 25..48

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 311:

CCTACCAAGC AAGCAGCCCC AGCTTAGGGT CAGACAGGGT GAGCCTC 47



(2) INFORMATION FOR SEQ ID NO: 312: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 47 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: polymorphic fragment 99-2452 

(B) LOCATION: 1..47 

(D) OTHER INFORMATION: variant version of SEQ ID306 

(ix) FEATURE: 

(A) NAME/KEY: polymorphic base 

(B) LOCATION: 24 

(D) OTHER INFORMATION: base t; c in SEQ ID306
(ix) FEATURE:

(A) NAME/KEY: microsequencing oligo 99-2452-mis1

(B) LOCATION: 5..23

(ix) FEATURE:

(A) NAME/KEY: Potential microsequencing oligo 99-2452-mis2
(B) LOCATION: complement 25..47

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 312:
TCCGCGCGGA CATGGAGGAC GTCTGCCGCC GCCTCGTCCA GTACCGC 47

(2) INFORMATION FOR SEQ ID NO: 313:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 20 base pairs

(B) TYPE: NUCLEIC ACID

(C) STRANDEDNESS: SINGLE

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: upstream amplification primer for SEQ ID301 and SEQ

ID307

(B) LOCATION: 1..20

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 313:

GCTCTCATAT TCATTGGGTG 20 

(2) INFORMATION FOR SEQ ID NO: 314: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 18 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: upstream amplification primer for SEQ ID302 and SEQ

ID308

(B) LOCATION: 1..18
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 314:

TCTCTCCCGT GTTAAATG 18



(2) INFORMATION FOR SEQ ID NO: 315:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 18 base pairs

(B) TYPE: NUCLEIC ACID

(C) STRANDEDNESS: SINGLE
(D) TOPOLOGY: LINEAR

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: upstream amplification primer for SEQ ID303 and SEQ

ID309

(B) LOCATION: 1..18

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 315:

AATCTTCTTG CTCCTGTC 18



(2) INFORMATION FOR SEQ ID NO: 316:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 18 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE:

(A) NAME/KEY: upstream amplification primer for SEQ ID304 and SEQ

ID310

(B) LOCATION: 1..18 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 316: 



AGGTTAGGGG TGTATTTC 18

(2) INFORMATION FOR SEQ ID NO: 317:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 18 base pairs
(B) TYPE: NUCLEIC ACID

(C) STRANDEDNESS: SINGLE

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens

(ix) FEATURE:

(A) NAME/KEY: upstream amplification primer for SEQ ID305 and SEQ

ID311

(B) LOCATION: 1..18
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 317:

AGACTGTGAC CTTAGACC 18



(2) INFORMATION FOR SEQ ID NO: 318:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 18 base pairs 

(B) TYPE: NUCLEIC ACID 
(C) STRANDEDNESS: SINGLE
(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: upstream amplification primer for SEQ ID306 and SEQ 

ID312 

(B) LOCATION: 1..18 

(D) OTHER INFORMATION: Extracted from sequence gb:M10065 

(3791..3808)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 318: 



GACGAGACCA TGAAGGAG 



(2) INFORMATION FOR SEQ ID NO: 319: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 19 base pairs
(B) TYPE: NUCLEIC ACID
(C) STRANDEDNESS: SINGLE
(D) TOPOLOGY: LINEAR

(ii) MOLECULE TYPE: DNA

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens

(ix) FEATURE:

(A) NAME/KEY: downstream amplification primer for SEQ ID301 and

SEQ ID307

(B) LOCATION: 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 319: 

TGGCTGCGGT TAGATGCTC 19 



(2) INFORMATION FOR SEQ ID NO: 320: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 18 base pairs

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 



(ii) MOLECULE TYPE: DNA 



(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 



(ix) FEATURE: 

(A) NAME/KEY: downstream amplification primer for SEQ ID302 and

SEQ ID308 

(B) LOCATION: 1..18 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 320: 



AGGGGTAACT CTTGATTG 



(2) INFORMATION FOR SEQ ID NO: 321: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 18 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 



(vi) ORIGINAL SOURCE:
(A) ORGANISM: Homo sapiens
(ix) FEATURE:

(A) NAME/KEY: downstream amplification primer for SEQ ID303 and

(B) LOCATION: 1..18

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 321:

ACCAAGCCAT ACCTTCTC 18



(2) INFORMATION FOR SEQ ID NO: 322:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 18 base pairs

(B) TYPE: NUCLEIC ACID

(C) STRANDEDNESS: SINGLE

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: downstream amplification primer for SEQ ID304 and

SEQ ID310

(B) LOCATION: 1..18

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 322:

ATACAGCCAG GGAGATAG 18



(2) INFORMATION FOR SEQ ID NO: 323: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 18 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: downstream amplification primer for SEQ ID305 and 

SEQ ID311 

(B) LOCATION: 1..18
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 323:



AATTGCTACC CCCAATTC 



(2) INFORMATION FOR SEQ ID NO: 324:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 18 base pairs
(B) TYPE: NUCLEIC ACID

(C) STRANDEDNESS: SINGLE

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE:

(A) NAME/KEY: downstream amplification primer for SEQ ID306 and

SEQ ID312

(B) LOCATION: 1..18

(D) OTHER INFORMATION: Extracted from sequence gb:M10065
(complement 1370..4395)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 324:

TCGAACCACC TCTTGAGG



(2) INFORMATION FOR SEQ ID NO: 325: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 23 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: potential microsequencing oligo 99-344.mis1

(B) LOCATION: 1..23

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 325: 



TGCTGCCAAG GATCCATGTC AGC 23

(2) INFORMATION FOR SEQ ID NO: 326:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 19 base pairs

(B) TYPE: NUCLEIC ACID

(C) STRANDEDNESS: SINGLE

(D) TOPOLOGY: LINEAR

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens

(ix) FEATURE:

(A) NAME/KEY: microsequencing oligo 99-366.mis1

(B) LOCATION: 1..19

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 326:

CCTGGCTTCA GGGACAGCT 19



(2) INFORMATION FOR SEQ ID NO: 327:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 23 base pairs

(B) TYPE: NUCLEIC ACID

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: potential microsequencing oligo 99-359.mis1

(B) LOCATION: 1..23 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 327: 



CTACAGAGTC ATCGCCTCCA TCC 23



(2) INFORMATION FOR SEQ ID NO: 328: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 23 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR
(ii) MOLECULE TYPE: DNA

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: potential microsequencing oligo 99-355.mis1

(B) LOCATION: 1..23

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 328:

GGAGTTTCGG GGAGTTTCGG GAG 23



(2) INFORMATION FOR SEQ ID NO: 329:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 19 base pairs

(B) TYPE: NUCLEIC ACID

(C) STRANDEDNESS: SINGLE

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE:

(A) NAME/KEY: microsequencing oligo 99-365.mis1
(B) LOCATION: 1..19

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 329: 



CCAAGCAAGC AGCCCCAGC 



(2) INFORMATION FOR SEQ ID NO: 330: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 19 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: microsequencing oligo 99-2452.mis1

(B) LOCATION: 1..19 




(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 330: 
CGCGGACATG GAGGACGTG 19 

(2) INFORMATION FOR SEQ ID NO: 331:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 19 base pairs

(B) TYPE: NUCLEIC ACID

(C) STRANDEDNESS: SINGLE

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE:

(A) NAME/KEY: microsequencing oligo 99-344.mis2
(B) LOCATION: 1..19

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 331: 
CAGGGCTCAG AGAGGAGCA 19 



(2) INFORMATION FOR SEQ ID NO: 332:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 23 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: potential microsequencing oligo 99-366.mis2

(B) LOCATION: 1..23 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 332: 
CACTAACTCA ACAAACATTT CCT 23 



(2) INFORMATION FOR SEQ ID NO: 333:
(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 19 base pairs
(B) TYPE: NUCLEIC ACID

(C) STRANDEDNESS: SINGLE

(D) TOPOLOGY: LINEAR

(ii) MOLECULE TYPE: DNA

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens

(ix) FEATURE:

(A) NAME/KEY: microsequencing oligo 99-359.mis2
(B) LOCATION: 1..19

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 333:



TCCCAGGATT TGTTGAGAC 19 



(2) INFORMATION FOR SEQ ID NO: 334:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 19 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: microsequencing oligo 99-355.mis2

(B) LOCATION: 1..19 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 334: 



GGAGCTTCTT CCCAGGAAC 19 



(2) INFORMATION FOR SEQ ID NO: 335: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 23 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 



(vi) ORIGINAL SOURCE:
(A) ORGANISM: Homo sapiens
(ix) FEATURE:

(A) NAME/KEY: potential microsequencing oligo 99-365.mis2

(B) LOCATION: 1..23

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 335:

GAGCCTCACC CTCTCTGACC CTA 23



(2) INFORMATION FOR SEQ ID NO: 336:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 23 base pairs 

(B) TYPE: NUCLEIC ACID

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: potential microsequencing oligo 99-2452.mis2

(B) LOCATION: 1..23

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 336:

GCGGTACTGC ACCAGGCGGC CGC 23



